ABSTRACT

Context. The diversification of galaxies is caused by transforming events such as accretion, interaction, or mergers. These explain 
the formation and evolution of galaxies, which can now be described by many observables. Multivariate analyses are the obvious 
tools to tackle the available datasets and understand the differences between different kinds of objects. However, depending on the 
method used, redundancies, incompatibilities, or subjective choices of the parameters can diminish the usefulness of these analyses. 
The behaviour of the available parameters should be analysed before any objective reduction in the dimensionality and any subsequent 
clustering analyses can be undertaken, especially in an evolutionary context. 

Aims. We study a sample of 424 early-type galaxies described by 25 parameters, 10 of which are Lick indices, to identify the most 
discriminant parameters and construct an evolutionary classification of these objects. 
Methods. Four independent statistical methods are used to investigate the discriminant properties of the observables and the parti- 
tioning of the 424 galaxies: Principal component analysis, K-means cluster analysis, minimum contradiction analysis, and Cladistics. 
Results. The methods agree in terms of six parameters: central velocity dispersion, disc-to-bulge ratio, effective surface brightness, 
metallicity, and the line indices NaD and OIII. The partitioning found using these six parameters, when projected onto the fundamental 
plane, looks very similar to the partitioning obtained previously for a totally different sample and based only on the parameters of 
the fundamental plane. Two additional groups are identified here, and we are able to provide some more constraints on the assembly 
history of galaxies within each group thanks to the larger number of parameters. We also identify another "fundamental plane" with 
the absolute K magnitude, the linear diameter, and the Lick index Hβ. We confirm that the Mgb vs velocity dispersion correlation 
is very probably an evolutionary correlation, in addition to several other scaling relations. Finally, combining the results of our two 
papers, we obtain a classification of galaxies that is based on the transforming processes that are at the origin of the different groups. 

Conclusions. By taking into account that galaxies are evolving complex objects and using appropriate tools, we are able to derive an 
explanatory classification of galaxies, based on the physical causes of the diverse properties of galaxies, as opposed to the descriptive 
classifications that are quite common in astrophysics. 

Key words. galaxies: elliptical and lenticular, cD - galaxies: evolution - galaxies: formation - galaxies: fundamental parameters - 
methods: statistical 
1. Introduction

Galaxies are complex and evolving objects. Their diversity
appears to increase rapidly with the instrumental improvements
that produce huge databases. A good understanding of the
physics governing the processes at work within and between the
different components of galaxies requires numerical simulations
that produce synthetic populations of hopefully realistic objects.
The number of physical processes that may operate, together with
their infinite possible configurations, renders the morphological
Hubble classification and its equivalents obviously too simple.
Morphology, as detailed as it can be determined in the visible, is
only one component of the physics of galaxies, and ignores many
ingredients of galaxy evolution, such as kinematics and chemical
composition (e.g. Cappellari et al. 2011). In addition, these
classifications do not make full use of the wealth of information
that observations and numerical simulations provide.

Multivariate partitioning analyses appear to be the most
appropriate tools. One basic tool, the principal component
analysis, is relatively well-known (e.g. Cabanac et al. 2002;
Recio-Blanco et al. 2006), but this is not a clustering tool in
itself. Many attempts to apply multivariate clustering methods
have been made (e.g. Ellis et al. 2005; Chattopadhyay &
Chattopadhyay 2006, 2007; Chattopadhyay et al. 2007, 2008,
2009a,b,c; Fraix-Burnet et al. 2009; Sanchez Almeida et al. 2010;
Fraix-Burnet et al. 2010). Sophisticated statistical tools are used
in some areas of astrophysics and are being developed steadily,
but multivariate analysis and clustering techniques have not yet
been widely applied across the community. It is true that the
interpretation of the results is not always easy. Some of the
reasons are given below.

Before using the available parameters to derive and com- 
pare the physical properties of galaxies, it is important to check 
whether they can discriminate between different kinds of galax- 
ies. A partitioning of objects into robust groups can only be ob- 
tained with discriminant parameters. This does not necessarily 
preclude using other information to help the physical and evolu- 
tionary interpretation of the properties of the groups and the re- 
lationships between groups. Among the descriptors of galaxies, 
many come directly from the observations independently of any 
model. In principle, all the information is contained within the 
spectrum. However, since it is a huge amount of information, it is 
usually summarized by broad-band fluxes (magnitudes), slopes 
(colours), medium-band and line fluxes (e.g. the Lick indices). 
This dimensionality reduction is often guided by observational 
constraints or some physical a priori, but not by discriminant (i.e. 
statistical) properties. 

Multivariate partitionings group objects according to global 
similarities. They yield a descriptive classification of the diver- 
sity, but do not provide any explanation of the differences in 
properties between groups. Modelling or numerical simulations 
must be used to understand physically the partitioning and the 
relationships between the groups. 

However, galaxy properties are indeed explained by evolution.
Mass, metallicity, morphology, colours, etc., are all the result
of galaxy evolution. It can thus be expected that the
relationships between the groups are driven by evolution. In
addition, galaxy (formation and) evolution proceeds through a
limited number of transforming processes (monolithic collapse,
secular evolution, gravitational interaction, accretion/merger,
and sweeping/ejection; see Fraix-Burnet et al. 2006b,c). Since
they depend on so many parameters (initial conditions, nature of
the objects involved, impact parameters, ...), the outcomes of
each of them vary a lot, so that diversity is naturally created
through evolution of the galaxy populations. This is what we call
diversification.

It is easy to see that each of these transforming processes
follows a "transmission with modification" scheme, because a
galaxy is made of stars, gas, and dust, which are both transmitted
and modified during the transforming event (Fraix-Burnet et al.
2006b,c,a). This is why one might expect a hierarchical
organization of galaxy diversity with evolutionary relationships
between groups. Cladistics has been shown to be an adequate
(Fraix-Burnet et al. 2006b,c,a) and quite effective (e.g.
Fraix-Burnet et al. 2009, 2010) tool for identifying this
hierarchical organization. Instead of a descriptive classification
of galaxy diversity, we hope to be able to build an explanatory
classification that is physically more informative.

In a previous work (Fraix-Burnet et al. 2010), we found that
the fundamental plane of early-type galaxies is probably
generated by diversification. We believe that this result is
important enough to deserve dedicated studies to assess its
robustness. The present work is novel in several fundamental ways:

- A distinct data set is used, for which more parameters are
available, useful for the analyses themselves and also for the
subsequent interpretation, once the groups are determined
(Sect. 2.1).

- Two additional methods, PCA and MCA, are used
(Sect. 2.2).

- A new set of parameters is used for the various partitioning
methods (Sect. 4), which are selected in a rather objective
way (Sect. 3), unlike the ad hoc selection of parameters
from longtime conventional wisdom (as followed in
Fraix-Burnet et al. 2010).

- Measurement errors are used in the classification in the case
of the cladistic analysis (Sect. 2.1 and Appendix B.2).

- The combination of the partitionings in the present paper and
those in Fraix-Burnet et al. (2010) helps us to devise a new
scheme for galaxy classification.



We present the data in Sect. 2.1 before describing the
philosophy of our approach with the different methods used to
analyse the discriminant properties of the parameters (i.e. their
ability to discriminate between different groups) and the
partitioning of the sample in Sect. 2.2. We then give the results
of these analyses and the "winning" set of parameters (Sect. 3)
that is used for the partitioning (Sect. 4). We then comment on
the discriminant parameters (Sect. 5) and detail the group
properties (Sect. 6). Scaling relations, correlations, and scatter
plots are presented in Sect. 7 and we discuss the well-known
fundamental plane of early-type galaxies as well as another
fundamental plane, which we discover in this paper, in Sect. 7.4.
Finally, we combine our present result with the one in
Fraix-Burnet et al. (2010) by plotting a cladogram that summarizes
the inferred assembly histories of the galaxies and thus is a
tentative new scheme for classifying galaxies (Sect. 8). The
conclusion of this study closes this paper (Sect. 9).



2. Data and methods 

2.1. Data 

We selected 424 fully documented galaxies from the sample of
509 early-type galaxies in the local Universe of Ogando et al.
(2008). As these authors point out, this sample appears to be
relatively small compared to those at intermediate redshifts that
have been obtained with large surveys (such as the Sloan Digital
Sky Survey). However, it does have the advantage of higher
quality spectroscopic data and more reliable structural
information such as the effective radius. To describe the
galaxies, we took from Ogando et al. (2008) the 10 parameters
that belong to the set of 25 Lick indices defined by Worthey &
Ottaviani (1997): Hβ, Fe5015, Mg1, Mg2, Mgb, Fe5270, Fe5335,
Fe5406, Fe5709, and NaD. From these Lick indices, we computed
two other parameters defined as
$[MgbFe]' = \sqrt{Mgb \times (0.72 \times Fe5270 + 0.28 \times Fe5335)}$
(Thomas et al. 2003) and
$Mgb/Fe = Mgb / (\frac{1}{2}(Fe5270 + Fe5335))$ (Gonzales 1997),
which are indicators of metallicity and light-element abundance,
respectively. The other parameters taken from Ogando et al.
(2008) were the number of companions n_c, the morphological
type T, the line index OIII, the velocity dispersion (log σ),
and the linear effective radius (log r_e).
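The two derived indices can be sketched in a few lines. This is only an illustration of the definitions quoted above, assuming the mean iron index is (Fe5270 + Fe5335)/2; the function names are hypothetical and the numerical values in the usage note are invented, not taken from the catalogue.

```python
import math


def mgbfe_prime(mgb, fe5270, fe5335):
    """[MgbFe]' = sqrt(Mgb * (0.72 * Fe5270 + 0.28 * Fe5335)), a metallicity indicator."""
    return math.sqrt(mgb * (0.72 * fe5270 + 0.28 * fe5335))


def mgb_over_fe(mgb, fe5270, fe5335):
    """Mgb/Fe, with Fe taken as the mean of Fe5270 and Fe5335 (assumed definition)."""
    mean_fe = 0.5 * (fe5270 + fe5335)
    return mgb / mean_fe
```

For instance, `mgbfe_prime(4.0, 3.0, 3.0)` evaluates the square root of 4 × 3, and `mgb_over_fe(4.0, 3.0, 3.0)` gives 4/3; both inputs are made-up index values in ångströms.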

The surface brightness within the effective radius (Bri_e)
and the disc-to-bulge ratio (D/B) were taken from Alonso et al.
(2003). The absolute magnitude in B (M_abs) and the distance
of the galaxies were taken from Hyperleda¹, which adopts a
Hubble constant of 70 km/s/Mpc. The distances to three galaxies
not available in Hyperleda were taken from the literature:
NGC 1400 (27.7 Mpc, from Perrett et al. (1997)), NGC 4550
(15.49 Mpc from Mei et al. (2007)), and NGC 5206 (3.6 Mpc
from Karachentsev et al. (2002)). The colour B-R was calculated
from the corrected apparent B magnitude in Hyperleda
and the total R magnitude given by Alonso et al. (2003). The
linear diameter (log(diam)) was computed from logdc given in
Hyperleda. The infrared magnitudes and colours were taken or
calculated from NED².

1 http://leda.univ-lyon1.fr/

Altogether, we have 25 parameters to describe the 424
galaxies. However, two parameters were removed from our analyses,
namely the number of companions n_c and the morphological
type T, which are both discrete parameters. More importantly,
n_c is not a property of the galaxies, but of their local
environment, while T is qualitative and subjective. The full set
of 25 parameters is naturally used to interpret our results.

We thus used 23 parameters for the analyses in this paper:
three are geometrical (D/B, log r_e, and log(diam)), two come
from medium-resolution spectra (log σ and OIII), in addition to
the ten Lick indices (Hβ, Fe5015, Mg1, Mg2, Mgb, Fe5270,
Fe5335, Fe5406, Fe5709, and NaD) and [MgbFe]', Mgb/Fe;
the six others are broad-band observables (Bri_e, the absolute
magnitudes in B (M_abs) and K (K_abs), and the total colours
B-R, J-H, and H-K).

Ogando et al. (2008) provide error bars for each of these
parameters. However, evaluating the influence of measurement
errors on the partitioning is difficult because their multivariate
distribution function is unknown. This appears to be a big
statistical problem. Fuzzy cluster analyses could perhaps be
useful but they are quite complicated to implement, and the very
good agreement between all our results indicates that such an
investment is unnecessary at this point. In addition, measurement
uncertainties can easily be integrated into the cladistic
analysis. We thus limit ourselves to two restricted assessments:
the influence of two determinations of the distances of galaxies
(needed to determine r_e) on the result, and the cladistic
analysis with errors. These are described in Appendix B. In any
case, one should consider that the physical nature of galaxies
implies that there are continuous variations in the parameters,
hence the partitions are necessarily fuzzy with no rigid
boundaries between groups. This implies that there is some
uncertainty in the placement of the individual objects in the
multivariate parameter space.

2.2. Methods 

The philosophy of our approach is to use multivariate tools in a 
first step to select the parameters that can discriminate different 
groups within the whole sample. These parameters, called dis- 
criminant parameters, are then used in a second step to partition 
the data into several groups. 

In this paper, we use four methods, which are described in
more detail in Appendix A. Three of them are used to analyse
the parameters: principal component analysis (PCA, Sect. A.1),
minimum contradiction analysis (MCA, Sect. A.2), and cladistics
(Sect. A.3), while the groupings are performed with the two
latter (MCA and cladistics) together with a cluster analysis (CA,
Sect. A.4).

The four approaches are all very different in philosophy and 
technique. Since there is no ideal statistical method, it is use- 
ful to compare results obtained with these independent methods. 
Convergence improves confidence, but since the assumptions be- 
hind the different techniques are different, exact agreement can- 
not be expected. In the end, it is the physics that decides whether 
a partitioning is informative. 

3. Analyses of the parameters 

In this section, we investigate the behaviour of the observables
using three of the methods presented above: PCA, MCA, and
cladistics. These three multivariate techniques use the
parameters directly, rather than the distance measures used in
the cluster analysis also considered in this paper. Hence, much
information can be gained about the parameters themselves, such
as their correlations (PCA) or their respective behaviour in the
partitioning process (MCA and cladistics). From this information,
one can infer the discriminant power of all parameters for the
studied sample.

2 http://nedwww.ipac.caltech.edu/

Fig. 1. PCA eigenvectors for the sample of 424 objects with 23
parameters. The eigenvalue 1 is indicated by the horizontal line
and the eight eigenvectors higher than 1 or so are darkened.

3.1. Principal component analysis

We performed a PCA (Sect. A.1) on the set of 23 parameters.
Six principal components (PC or eigenvectors) have
eigenvalues greater than 1 and two others are very close to 1
(Fig. 1), so that eight components describe most (82%) of the
variance of the sample while the first five account for 69% of the
variance.
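The eigenvalue-greater-than-1 selection and the cumulative variance quoted above can be illustrated with a short sketch. It assumes that the PCA is computed on the correlation matrix (i.e. on standardised parameters), which is what makes the threshold of 1 meaningful; the data below are a synthetic stand-in with the same shape as the sample (424 objects, 23 parameters), not the actual catalogue, and the function names are hypothetical.

```python
import numpy as np


def pca_eigenvalues(X):
    """Eigenvalues of the correlation matrix of X (rows: objects, columns: parameters),
    sorted in decreasing order."""
    X = np.asarray(X, dtype=float)
    corr = np.corrcoef(X, rowvar=False)
    return np.linalg.eigvalsh(corr)[::-1]


def kaiser_summary(eigvals, threshold=1.0):
    """Number of components with eigenvalue above the threshold, and the
    cumulative fraction of variance explained by the ordered components."""
    explained = eigvals / eigvals.sum()
    n_kept = int((eigvals > threshold).sum())
    return n_kept, np.cumsum(explained)


# Synthetic stand-in: 23 observables driven by 3 hidden factors plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(424, 3))
mixing = rng.normal(size=(3, 23))
X_demo = latent @ mixing + 0.5 * rng.normal(size=(424, 23))

eigvals = pca_eigenvalues(X_demo)
n_kept, cumvar = kaiser_summary(eigvals)
print(n_kept, cumvar[:n_kept])
```

Because the correlation matrix has unit diagonal, the eigenvalues sum to the number of parameters, so an eigenvalue of 1 corresponds to the variance carried by a single standardised parameter.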

The loadings (i.e. the coefficients of the parameters composing
the eigenvectors, Table A.1) give some indications as to
which parameters are correlated, redundant, or discriminant. The
most important parameters (the first few with the highest
loadings in the first eigenvector, and the first parameter for the
other eigenvectors) in each PC are:

1. Mgb, log σ, Mg2, [MgbFe]', NaD, Mg1
2. Bri_e
3. OIII
4. Bri_e
5. H-K
6. J-H
7. D/B
8. Fe5709
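Ranking the parameters by the absolute value of their loadings in each component, as done for the list above, could be sketched as follows. The sketch again assumes a correlation-matrix PCA; `top_loadings` and its arguments are hypothetical names, and the actual loadings of the paper are those of Table A.1, not the output of this code.

```python
import numpy as np


def top_loadings(X, names, n_components, n_top=1):
    """For each of the first n_components principal components, return the
    names of the n_top parameters with the largest absolute loadings."""
    X = np.asarray(X, dtype=float)
    corr = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]            # component indices, largest first
    result = []
    for idx in order[:n_components]:
        loadings = eigvecs[:, idx]
        ranked = np.argsort(np.abs(loadings))[::-1][:n_top]
        result.append([names[i] for i in ranked])
    return result
```

Two parameters that appear together at the top of the same component are strongly correlated in the sample, which is why such a ranking flags candidates for redundancy as well as for discriminant power.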

Since the parameters Mgb, Mg1, and Mg2 are so closely related
(Burstein et al. 1984), these three quantities are undoubtedly
redundant. Moreover, [MgbFe]' depends very much on Mgb, and is
considered a more accurate estimate of the metallicity of a
galaxy. Hence, the most important a priori non-redundant
parameters are: log σ, [MgbFe]', NaD, Bri_e, OIII, H-K, J-H,
D/B, and Fe5709.

We then performed a second PCA analysis after removing
the supposedly redundant parameters (Mgb, Mg1, Mg2, M_abs),
as well as log r_e and log(diam), which are affected by the
uncertainties in the distance determination (see Appendix B.2).
We also disregarded J-H, which has an outlier and otherwise quite
constant values. We note that, somewhat paradoxically, this
behaviour could explain why J-H appears in the sixth principal
component above. We are now left with the 16 parameters log σ,
Bri_e, D/B, Hβ, Fe5015, Fe5270, Fe5335, Fe5406, Fe5709, NaD,
OIII, H-K, B-R, K_abs, [MgbFe]', and Mgb/Fe. Five eigenvectors
have eigenvalues higher than 1 and account for 68% of the
variance. The most important parameters are now:

1. [MgbFe]', log σ, NaD
2. Fe5015
3. Bri_e
4. Fe5709
5. H-K

The agreement is very good with [MgbFe]', log σ, NaD,
Bri_e, Fe5709, and H-K still present, while OIII has disappeared
but has loadings very close to those of Fe5015 and Fe5709 in
components 2 and 4, respectively. The disc-to-bulge ratio D/B
does not appear to be as important in this analysis.

3.2. Minimum contradiction analysis 

The MCA analysis (Sect. A.2) uses all parameters and explores
them to determine the best order (partitioning) that can be
obtained. It is possible to derive the discriminant capacity of
the parameters according to their respective behaviour as
formalised in Thuillard & Fraix-Burnet (2009). We find that:

- log σ, Fe5270, NaD, [MgbFe]', Bri_e, B-R, OIII, and D/B
appear as discriminant parameters;

- log σ, K_abs, and log(diam) are strongly correlated.

In contrast to PCA, the correlations are not automatically
removed; some or all of them may remain. In the present case,
the three strongly correlated parameters are not obviously
redundant since they are not related by a direct causal relation
(see Fraix-Burnet 2011). However, keeping all of them for the
MCA analysis does not bring any more discriminating information
than keeping only one (Thuillard & Fraix-Burnet 2009). As
a consequence, since log σ is listed as one of the discriminant
parameters, K_abs and log(diam) can be disregarded in the
analysis.

Consequently, the MCA analysis finds eight discriminant pa- 
rameters. 

3.3. Cladistic analysis 

Each cladistic analysis (Sect. A.3) uses and investigates a given
set of parameters. To improve our understanding of the 23
parameters, it would be necessary to analyse all possible
subsets. Since this would require too much computing time, we
decided to eliminate obvious redundancies (Mgb, Mg1, Mg2, M_abs,
and log(diam)). The index Hβ, which is an age indicator for
stellar populations older than a few hundred Myr, is problematic
for cladistics because age is a property of all groups. This
parameter might be able to trace recent transformative events
accompanied by starbursts, if it were not for the degeneracy
between the age of a younger stellar component and its relative
contribution to the total stellar mass or luminosity. Guided by
the PCA and MCA analyses, we also disregarded Fe5335, Fe5406,
H-K, Mgb/Fe, and J-H. It is remarkable that log r_e is not found
in the PCA and MCA analyses as a discriminant parameter. We thus
also disregarded it here, but kept it for a specific analysis of
the fundamental plane together with log σ, Bri_e, and Mg2 (see
Sect. 7.4.1 and Fraix-Burnet et al. 2010).

Table 1. Subsets of parameters used in cladistic analyses to
determine the most discriminant parameters. The names of the
subsets include the number of parameters.

Subset  Parameters
4cA     log σ  D/B  NaD  [MgbFe]'
5c      log σ  D/B  NaD  [MgbFe]'  Bri_e
5cA     log σ  D/B  NaD  [MgbFe]'  Mgb
6c      log σ  D/B  NaD  [MgbFe]'  Bri_e  OIII
6cA     log σ  D/B  NaD  [MgbFe]'  Mgb  Bri_e
7c      log σ  D/B  NaD  [MgbFe]'  Bri_e  OIII  Fe5015
8c      log σ  D/B  NaD  [MgbFe]'  Bri_e  OIII  Fe5015  Mgb
10c     log σ  D/B  NaD  [MgbFe]'  Bri_e  OIII  Mgb  Fe5270  Fe5709  B-R

Finally, we studied in more detail the remaining eleven
parameters: log σ, D/B, NaD, [MgbFe]', Bri_e, OIII, Mgb,
Fe5015, Fe5270, Fe5709, and B-R. To find the most discriminant
ones in this list, we examined the relative robustness of
the trees obtained by cladistic analyses using the eight subsets
of these parameters listed in Table 1. Analyses of each parameter
subset were performed with the full sample and several
subsamples, and all the results were compared. The details of our
procedure are presented in Sect. A.3.

Five or six discriminant parameters are favoured by the
cladistic analysis because the results are then more stable. The
trees from subsets 5c and 6c are in very good agreement, that of
6c being almost entirely structured, and the result for subset 6c
is in very good agreement with the cluster analysis (Sect. 4).

We conclude that the most discriminant set of parameters
from the cladistic analyses is that of 6c, namely log σ, D/B,
NaD, [MgbFe]', Bri_e, and OIII.

3.4. Final set of discriminant parameters 

The PCA, MCA, and cladistic analyses agree in terms of the
five parameters log σ, [MgbFe]', NaD, Bri_e, and OIII, while
they globally identify five to eight discriminant parameters. The
cladistic and MCA analyses find that D/B is an important
parameter, which appears only weakly in PCA. B-R is discriminant
in MCA only, while the iron indices appear with Fe5015 and
Fe5709 on one side (PCA), and Fe5270 on the other (MCA).
None of these four parameters are preferred in the cladistic
analyses.

Hence, we select the consensual six parameters log σ, D/B,
NaD, [MgbFe]', Bri_e, and OIII for partitioning analyses of our
sample.
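The consensus reasoning of this section can be made explicit with a simple tally of how many methods flag each parameter. The sets below are an illustrative encoding of the text, not an output of the analyses: in particular, D/B is left out of the PCA set because it appears only weakly there, and the string labels and function names are hypothetical.

```python
from collections import Counter

# Illustrative encoding of the parameters flagged by each method (see text).
flagged = {
    "PCA": {"log_sigma", "[MgbFe]'", "NaD", "Bri_e", "OIII",
            "Fe5015", "Fe5709", "H-K"},
    "MCA": {"log_sigma", "[MgbFe]'", "NaD", "Bri_e", "OIII",
            "D/B", "Fe5270", "B-R"},
    "cladistics": {"log_sigma", "[MgbFe]'", "NaD", "Bri_e", "OIII", "D/B"},
}


def tally(flagged_by_method):
    """Count, for each parameter, the number of methods that flag it."""
    counts = Counter()
    for params in flagged_by_method.values():
        counts.update(params)
    return counts


def consensus(flagged_by_method, min_methods):
    """Parameters flagged by at least min_methods of the methods."""
    counts = tally(flagged_by_method)
    return {p for p, n in counts.items() if n >= min_methods}


print(sorted(consensus(flagged, 3)))  # flagged by all three methods
print(sorted(consensus(flagged, 2)))  # flagged by at least two methods
```

Under this encoding, requiring agreement of all three methods returns the five common parameters, and relaxing the requirement to two methods adds D/B, which gives the six parameters retained for the partitioning.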

4. Partitioning the sample galaxies 

We now compare the partitioning obtained with four methods:
a cluster analysis using eight principal components (Sect. 4.1),
a cluster analysis (Sect. 4.2), a MCA optimisation (Sect. 4.3),
and a cladistic analysis (Sect. 4.4), the latter three using the
six parameters listed in Sect. 3.4. The partitionings are compared
at the end of this section and in Fig. 2. The order of the groups
for cladistics is essentially dictated by the tree (and its
rooting, see Sect. 4.4), while for the other methods, the order
was arbitrarily chosen to correspond as much as possible to the
cladistic order.

Fig. 2. Comparison of analyses with three different methods:
PCA+CA (left panels), cluster analysis (middle panels), and
cladistics (right panels). The MCA result is not shown since it
can be easily compared to the cladistic result (see Sect. 4.5).
In the first row, the colours identify the eight groups found
in the cladistic analysis; in the second row, they identify the
seven groups of the cluster analysis and, in the third row, the
three groups found in PCA+CA. The colours for the cluster and
PCA+CA groups are chosen to more easily visualize the agreement
with the cladistic partitioning.

4.1. PCA plus cluster analysis

A cluster analysis (Sect. A.4) was performed using the eight PCs
obtained by PCA (Sect. 3.1 and Sect. A.1). In this paper, this
analysis is denoted PCA+CA. Three groups are found and
labelled PCACA1, PCACA2, and PCACA3. The first two contain
about 100 objects, while the third is about twice as big (Fig. 2).

As noted in Sect. A.1, the use of principal components in
multivariate clustering very likely obscures a significant part of
the underlying physics since it suppresses all correlations, even
those that are due to hidden parameters or independent
evolutions (see Sect. 6). We present this result here mainly as an
illustration of this point.

4.2. Cluster analysis 

A cluster analysis (Sect. A.4) was performed with the six
parameters listed in Sect. 3.4. Seven groups were found and named
Clus1 to Clus7. There are three large groups, with about 80 to
120 objects. The others contain from 20 to 40 objects (Fig. 2).
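A K-means partitioning of this kind can be sketched with plain Lloyd iterations. The sketch assumes that the six parameters are first put on a common scale by z-scoring and that the number of groups k is supplied by the user; neither choice is taken from the procedure of Sect. A.4, and the function names are hypothetical.

```python
import numpy as np


def standardize(X):
    """Z-score each column so that all parameters weigh equally in the distance."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0)


def kmeans(X, k, n_iter=100, seed=0):
    """Basic Lloyd algorithm: returns (labels, centers) for k groups."""
    rng = np.random.default_rng(seed)
    # Start from k distinct objects drawn at random from the sample.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance of every object to every centre.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Move each centre to the mean of its members (keep it if the group is empty).
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers
```

With a table `params` of shape (424, 6), a call such as `kmeans(standardize(params), k=7)` would return one group label per galaxy. The result of K-means depends on the starting centres, so in practice several seeds would be compared.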




Fig. 3. Most parsimonious tree found with cladistics with 
the identification of the eight groups and their corresponding 
colours. 



4.3. Minimum contradiction analysis 

With the six parameters listed in Sect. 3.4, the MCA analysis
performs an optimisation of the order to minimise the
contradiction (Sect. A.2). The result is four groups, and maybe
two others. The groups are globally very fuzzy, i.e. they have no
sharp limits. This is expected because of the continuous nature
of the parameters, and because of both uncertainties and
measurement errors. This is an important point that is essentially
overlooked by the other methods, and that should be kept in mind.

As we see below, these four groups are easily identified with 
the groups obtained by cladistics, and for this reason they are not 
given labels in this paper. 

4.4. Cladistic analysis 

The cladistic analysis, performed with the six parameters
selected in Sect. 3.4, produces a most parsimonious tree (shown
in Fig. 3), on which we can identify groups. There is no absolute
rule for defining groups on a cladogram. However, substructures
in the tree are a good guide.

We identified eight groups in this tree, three large ones with
more than 80 objects, an intermediate group with about 50
objects, and four smaller ones with fewer than 30 members (Fig. 3).
These groups are named "Clad1" (the most ancestral one, at the
top) to "Clad8" (at the bottom) (Fig. 3). This numbering and
presentation of the tree should not a priori be seen as a
diversification arrow since branches can be switched graphically.
It is the physical interpretation that both confirms the possible
ancestrality of group Clad1 and gives the right order of
diversification.

The rooting of the tree (i.e. the choice of objects that appear
graphically at the top of the tree and are supposed to be the
closest to the common "ancestor species" of all the objects of the
sample) is necessary to define the direction of diversification,
and in general affects the contours of the groups. The tree is here
rooted with the group of the lowest average metallicity as
measured by [MgbFe]' according to our assumption that the lowest
metallicity corresponds to the most ancestral objects. This guess
is monovariate and might not represent the best choice in a
multivariate study like the present one. However, we do not yet
have a better multivariate criterion for primitiveness. The
rooting of the tree can easily be changed.
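The rooting criterion itself, i.e. picking the group with the lowest average [MgbFe]', is simple to express. The sketch below only selects the candidate root group from per-object values and group labels; the actual re-rooting of the cladogram is done within the cladistic analysis, and the function name and the values in the usage note are hypothetical.

```python
from statistics import mean


def root_group(metallicity, labels):
    """Return the label of the group with the lowest mean metallicity indicator."""
    by_group = {}
    for value, label in zip(metallicity, labels):
        by_group.setdefault(label, []).append(value)
    return min(by_group, key=lambda g: mean(by_group[g]))
```

For example, with made-up values `[2.0, 2.2, 3.5, 3.7, 3.0]` and labels `["A", "A", "B", "B", "C"]`, group "A" has the lowest mean and would be taken as the root. Replacing the single indicator by a multivariate criterion would only require changing the quantity that is averaged.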

In contrast to the other partitioning methods, some objects 
appear to be isolated on the tree and consequently cannot be eas- 
ily grouped with others. Each of them could indeed represent a 
class, but, for the sake of simplicity, we decided not to identify 
them with specific colours. We simply gather all such objects 
as Clad0, give them a grey colour in plots or simply disregard 
them in the discussions that deal with the statistical properties of 
groups. 

4.5. Comparison of the four partitionings 

The four methods produce three (PCA+CA), four (MCA), seven 
(cluster analysis), and eight (cladistics) groups. They thus all 
agree on a relatively small number of groups. 

The agreement between cladistics and PCA+CA is quite
good (see Fig. 2), if we identify the three following groups:
(Clad1, Clad2, Clad3, Clad4, Clad5, and a part of Clad6), (Clad7),
and (Clad6, Clad8) with PCACA1, PCACA2, and PCACA3
respectively.

The agreement between the PCA+CA and cluster analyses is 
also quite good with PCACA3 being composed of Clus4, Clus7, 
and a part of Clus6, PCACA2 being composed essentially of 
Clus5 and partly Clus6, and PCACA1 being mainly composed 
of Clus1, Clus3, and part of Clus4. 

Cluster and cladistic partitionings agree very well for the
three large groups (Clus4≃Clad6, Clus5+Clus6≃Clad7, and
Clus7≃Clad8). The situation is slightly more complicated for
the other groups, but still convergent with Clus1≃Clad1+Clad3,
and Clad2 being split mainly into Clus2, Clus3, and also Clus4.
Conversely, Clus2 is contained mainly in Clad2 and also in
Clad7.

The four groups from MCA are in very good correspondence
with groups Clad6, Clad7, Clad8, and Clad1+Clad3. On the
other hand, Clad2+Clad5 does not seem well-justified based on
the MCA result. Interestingly, Clad6 and Clad8 are not quite
independent, in agreement with PCACA3 being mainly composed
of Clad6 and Clad8 as seen above.

The PCA+CA identifies a smaller number of groups than
the other partitionings do. This was expected, because of the
effect of the PCA analysis, which eliminates too many of the
correlations (Sect. A.1). The MCA result reinforces the
significance of groups Clad6, Clad7, Clad8, and Clad1+Clad3, which
are all also identified in the cluster analysis. The other groups
from cladistics and cluster analyses are either less robust or more
fuzzy.
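Comparisons between two partitionings, such as those described in this section, amount to cross-tabulating the group labels of the same objects. The following sketch shows one possible way to do this; the function names are hypothetical and the labels in the usage note are toy values, not the actual memberships.

```python
from collections import Counter


def cross_tabulate(labels_a, labels_b):
    """Count the objects falling in each pair (group in A, group in B)."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitionings must label the same objects")
    return Counter(zip(labels_a, labels_b))


def best_match(labels_a, labels_b):
    """For each group of A, the group of B that shares the most objects with it,
    and the fraction of the A group contained in that B group."""
    table = cross_tabulate(labels_a, labels_b)
    sizes = Counter(labels_a)
    result = {}
    for (a, b), n in table.items():
        fraction = n / sizes[a]
        if a not in result or fraction > result[a][1]:
            result[a] = (b, fraction)
    return result
```

With toy labels `["Clus4", "Clus4", "Clus4", "Clus7", "Clus7"]` and `["Clad6", "Clad6", "Clad2", "Clad8", "Clad8"]`, the first group is matched to Clad6 with two thirds of its members and the second to Clad8 with all of them. A group that is split over several groups of the other partitioning shows up as a low best-match fraction.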

We conclude that the number of groups is at least four, and 
probably seven or eight. In the following, we consider the cladis- 
tic result with eight groups because it provides the very impor- 
tant evolutionary relationships between them. 

Supplementary figures are given in Appendix C for the cluster
partitioning and can be used to check that our interpretation
does not depend on the detailed boundaries of the groups. In
addition, two complementary cladistic analyses, performed in
order to check the influence of two different determinations of
the distance of galaxies (needed to determine r_e) and of
measurement errors, are presented in Appendix B.

5. Discriminant descriptors of galaxies 

Among the initial 23 quantitative parameters (Sect. 2.1), only 6
are discriminant and actually yield a relatively robust
partitioning (Sect. 3.4). The 17 remaining parameters do not yield
enough information to distinguish different classes of objects,
because either intrinsically they are not informative, they bear
the same redundant information as the discriminant ones, or they
are not discriminant for the sample under study.

It is remarkable that the global luminosity of the galaxies
(M_abs or K_abs) is not discriminant. It is usually used as an
indicator of mass and chosen as a main criterion of a priori
classification. Luminosity is also often assumed to characterize
the level of evolution (for instance in the so-called "downsizing
effect"). However, from a diversification point of view, the
absence of the global luminosity is expected, since mass is a
global property that can be acquired by different processes, i.e.
accretion or merging, which have different timescales and
perturbing powers. Such parameters, which show too much
convergence, are not well-suited to establish phylogenies, that
is, they are not good tracers of the assembly history of galaxies.
Mass is bound to increase; it is thus not specific to any
particular assembly history, which could distinguish different
kinds of galaxies. Nevertheless, mass is not entirely absent and
is represented somehow in log σ and Bri_e, which are certainly
better tracers than mass itself of the way in which mass has been
assembled.

The index OIII, which tends to decrease in more metallic
galaxies, is a discriminant parameter, but H$\beta$, which is often used
as an age indicator, is not. This is not so surprising since age is
not an indicator of diversity, as it is shared by all objects (see a
discussion in Fraix-Burnet et al. 2009). Age, even more so than
either mass or size, is bound to increase independently of the
assembly history. Anyhow, defining an age for a galaxy is tricky:
it is often taken as the average age of the stellar populations,
which is a poor tracer of the assembly history.

The size parameters $\log r_e$ and log(diam) are not discriminant.
They are probably merely scaling factors that are somewhat
similar to mass, and bound to increase regardless of the sequence
of transforming events that occur during the assembly history of
galaxies. However, size does not seem to be represented at all,
or if so only weakly, in $\log\sigma$ and $Bri_e$, or perhaps in some
hidden correlation, which we study later in this paper.

However, one may wonder why Fraix-Burnet et al. (2010)
found a robust partitioning using only four parameters. Two of
them are in our list of six ($\log\sigma$ and $Bri_e$), one of them (Mg2)
is very similar to [MgbFe]', but the fourth one is $\log r_e$, which
is not a discriminant parameter in the present analysis. There are
several reasons for this.

First, in Fraix-Burnet et al. (2010), the four parameters were
not the result of a multivariate and objective selection, but were
chosen because common wisdom suggests that they may be
important for characterizing the physics of galaxies. The very
positive result obtained with these four parameters strongly supports
this a priori choice, but the present paper demonstrates that only
three of them are really discriminant parameters.

Second, three parameters out of four are discriminant, so that
the partitioning signal is borne by these three. Unless the fourth
parameter ($\log r_e$) is strongly erratic or contradictory, this signal
is not expected to be entirely destroyed (see Sect. 7.4.1).

Third, the discriminant parameters may in principle be
different from one sample to another, if the diversity of objects is
not equally covered. They may also depend on the initial set of
parameters, if more discriminant ones are present in a larger list.
This is probably the case for $\log r_e$, which has been replaced by
better observables.

It is thus unsurprising that the four parameters used in
Fraix-Burnet et al. (2010) yielded a robust partitioning and that
we find more discriminant parameters in the present study. The
six parameters selected in the present analyses are not
necessarily the most suitable ones for other samples, for which new
partitioning analyses should ideally be conducted.



6. Group properties 

The groups identified by the partitioning methods must be under- 
stood in the light of their statistical and comparative properties. 
In this section, we first identify the main trends along diversifi- 
cation and then describe the distinctive properties of the groups. 

For this purpose, we use boxplots, which give the four
quantiles of each parameter for the eight groups. We consider two
additional parameters. The dynamical mass is defined as
$M_{dyn} = A\sigma^2 R_e/G$ with $A = 3.8$ (with $M_{dyn}$ in solar
masses, $\sigma$ in km s$^{-1}$, and $R_e$ in kpc) according to
Hopkins et al. (2008), as done in Fraix-Burnet et al. (2010). This
makes $M_{dyn} \simeq 10^{5.95}\,\sigma^2 R_e$ in these units.
Using this mass, we compute the mass-to-light ratio $M/L_r$ using
$Bri_e = -2.5\log(L_r/\pi r_e^2) + 4.29$.
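
As a quick numerical check of the definition above, the following Python sketch evaluates the dynamical mass in the stated units. The value of G in kpc (km/s)^2 per solar mass is a standard constant that is not quoted in the text, and the function names are illustrative only. Under these assumptions the factor A/G is close to 10^5.95, which suggests that the number 5.95 in the text is the base-10 logarithm of the coefficient.

```python
import math

# Gravitational constant in kpc (km/s)^2 / M_sun.
# Assumption: standard value, not quoted in the paper itself.
G_KPC_KMS2_PER_MSUN = 4.30091e-6
A_STRUCTURE = 3.8  # constant A adopted in the text


def dynamical_mass(sigma_kms: float, r_e_kpc: float) -> float:
    """Return M_dyn = A * sigma^2 * R_e / G in solar masses."""
    return A_STRUCTURE * sigma_kms ** 2 * r_e_kpc / G_KPC_KMS2_PER_MSUN


def log_coefficient() -> float:
    """Base-10 logarithm of A / G, the factor multiplying sigma^2 * R_e."""
    return math.log10(A_STRUCTURE / G_KPC_KMS2_PER_MSUN)


if __name__ == "__main__":
    # Hypothetical galaxy: sigma = 200 km/s, R_e = 3 kpc.
    print(f"log10(A/G) = {log_coefficient():.3f}")
    print(f"M_dyn      = {dynamical_mass(200.0, 3.0):.3e} M_sun")
```

The mass-to-light ratio then follows by dividing this mass by the luminosity obtained from inverting the surface-brightness relation, in whatever units of $r_e$ that relation assumes.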

We show the most informative boxplots in Fig. 4; the
others do not indicate significant differences between groups. We
show the boxplots for the cluster partitioning in Fig. C.1 of
Appendix C.
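
The summary statistics behind such boxplots can be sketched as follows. This is a minimal illustration using only the Python standard library; the group label and values in the usage example are hypothetical, and the choice of the "inclusive" quantile method is an assumption, since the text does not say how the quantiles were computed.

```python
from statistics import quantiles


def boxplot_summary(values_by_group):
    """Return {group: (min, Q1, median, Q3, max)} for each group of values."""
    summary = {}
    for group, values in values_by_group.items():
        # n=4 yields the three cut points that split the data into quartiles.
        q1, median, q3 = quantiles(values, n=4, method="inclusive")
        summary[group] = (min(values), q1, median, q3, max(values))
    return summary


if __name__ == "__main__":
    # Hypothetical values of one parameter for one group.
    print(boxplot_summary({"GroupA": [1.0, 2.0, 3.0, 4.0, 5.0]}))
```

Comparing these five numbers from group to group, parameter by parameter, is what the boxplots display graphically.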

Figure 4 shows that $\log\sigma$, $\log r_e$, NaD, [MgbFe]', Mg b,
log(diam), Fe5270, Fe5335, Mg b/Fe, $M_{dyn}$, and $M/L_r$
essentially increase along the diversification rank defined on the tree
of Fig. 3, while H$\beta$, $M_{abs}$, $K_{abs}$, and possibly D/B decrease. As
already mentioned (Sect. 4.4), this rank is not necessarily as
linear as it seems. Anyhow, the adopted rooting of the tree gives
a very sensible result: globally, galaxies tend to become more
metallic, more luminous, more massive, and larger with
increasing diversification. At the same time, they acquire a larger central
velocity dispersion, which is often related to the higher mass,
and NaD is also known to increase with both mass and
velocity dispersion. In addition, the decrease in H$\beta$ indicates that the
average age of the stellar populations increases with
diversification.

Mg b/Fe increases with diversification. The index [$\alpha$/Fe] is
known to increase with galaxy mass and age, because
successive mergers and accretions trigger more intense star formation
on shorter timescales. These events clearly participate in the
diversification of galaxies, confirming our observed increase in
Mg b/Fe.

The OIII index does not show any trend with diversification,
but has a lower median value for the three groups Clad3, Clad4,
and Clad7.

There is no systematic trend in environmental properties
with diversification, $n_c$ having a wide range in all groups,
except in Clad4 where it is small. Since an observed galaxy is the
result of a long and multiple sequence of transforming events, it
is probably the past environment, rather than the observed one,
that plays a role in the diversification process.

The most diversified groups (Clad5 to Clad8) have on
average a lower D/B ratio, suggesting that transforming events, such
as accretion, interaction, and mergers, tend to destroy discs and
build larger bulges, presumably by randomizing stellar orbits.

Fig. 4. Selection of the most interesting boxplots.

Fig. 5. Scatter plots showing evolutionary correlations.

The morphological type is unevenly distributed among
groups, Clad2 and Clad4 having nearly only galaxies with
$T = -2$, whereas $T = -5$ galaxies are found mainly in Clad1
and Clad8 (respectively, the most ancestral and one of the most
diversified groups).

Apart from the general trends with diversification, the groups
have distinctive properties, otherwise there would be no reason
for finding separate groups. Their distinctive properties are the
following, where the number of members is given in parentheses:

- Clad1 (20 objects): these galaxies have the properties
  expected from an ancestral group in being small, faint, discy,
  and of low metallicity. They are young (although with a large
  spread in H$\beta$) and have a large spread in morphological type.
  They have very low $M_{dyn}$ and $\log\sigma$, and a low $M/L_r$.

- Clad2 (45): these galaxies have the same average H$\beta$, $Bri_e$,
  and OIII as Clad1. They are larger, brighter, more massive,
  more metallic, and have a much higher $\log\sigma$ than both Clad1
  and Clad3. They are all of morphological type $T = -2$, and
  have the highest D/B of all groups by far.

- Clad3 (16): these galaxies have a large range in many
  parameters (H$\beta$, $M_{abs}$ and $K_{abs}$, log(diam), $n_c$, OIII, Mg b/Fe),
  but not in [MgbFe]', $\log r_e$, $\log\sigma$, $Bri_e$, NaD, or $M_{dyn}$. They
  have a small log(diam) similar to Clad1, but a much higher
  $\log r_e$. They are relatively faint with a relatively low $\log\sigma$
  and OIII, and a high $Bri_e$.

- Clad4 (13): these galaxies resemble those of Clad2 in most
  respects (see discussion below on the respective placements
  of Clad2 and Clad3). In particular, they are all of
  morphological type $T = -2$. The main differences are that Clad4
  objects are large (the highest $\log r_e$ after Clad7), of low
  surface brightness (high $Bri_e$), have higher $M_{dyn}$ and $M/L_r$, and
  are slightly less discy.

- Clad5 (30): these galaxies are very similar to the Clad4 ones,
  except that they have a low $Bri_e$ and a very low D/B, the
  lowest of all groups with Clad8.

- Clad6 (85): this is one of the three largest groups, which are
  also the most diversified. Its galaxies have unexpectedly low
  values of $M_{dyn}$, $\log r_e$, and $Bri_e$. Interestingly, they have very
  similar properties to those of Clad2 galaxies, except for a
  much lower D/B, slightly lower $Bri_e$ and H$\beta$, and a slightly
  higher Mg b/Fe.

- Clad7 (94): these galaxies are the largest in this sample. They
  are the most luminous and have the highest $\log\sigma$, $M_{dyn}$,
  and $M/L_r$ together with Clad8. Clad7 galaxies have a higher
  $Bri_e$, slightly lower H$\beta$ and OIII, and a slightly higher D/B
  than Clad6 and Clad8.

- Clad8 (106): their distinctive properties are a very low D/B
  and a very large spread in morphological type, in a similar
  way to Clad1. They have a higher $\log\sigma$ and a lower $Bri_e$
  than galaxies of Clad7. They have values of $M_{dyn}$ and $M/L_r$
  that are as high as Clad7.

The Clad2 group often departs from the general trend along
diversification (Fig. 4), which would seem smoother if Clad2 and
Clad3 were inverted. We have noted (Sect. 4.3) that Clad2 is split
between Clus2, Clus3, and also Clus4. It is significantly higher
than expected in $\log\sigma$, D/B, NaD, [MgbFe]', Mg b, log(diam),
Fe5270, Fe5335, and lower in $M_{abs}$ and $K_{abs}$. This means that,
because of some parameters, it seems misplaced in the
diversification scenario for other parameters. This behaviour is visible in
the partitioning from the cluster analysis (Fig. C.1), since Clus2
and Clad2 are partially similar. Hence, why was Clad2 placed so
early by the cladistic analysis, while it is more diversified in four
of the six parameters used for the analysis?

The diversification scenario given in the tree of Fig. 3 is
obtained from the parsimony criterion, which chooses the simplest
combined evolution of all parameters. Taken individually, the
simplest evolutionary curve of each variable is monotonic with
as few reversals as possible. For instance, in the $\log\sigma$ boxplot of
Fig. 4, one would expect Clad2 and Clad3 to be inverted to prevent
the Clad2 box from "peaking". However, this is a multivariate
compromise, and since Clad2 would be better placed in the very first
position on the D/B plot, to "smooth" the evolutionary curve, it
is understandable that this is the most parsimonious placement
on the tree. In addition, while the two discriminant parameters
$Bri_e$ and OIII show a variable behaviour, they would
nevertheless induce us to place Clad2 before Clad3.

The conclusion is that Clad2 is correctly placed in second 
position, because this is a multivariate analysis, which seeks a 
compromise among several parameters. This shows the impor- 
tance of selecting the parameters objectively, with multivariate 
tools. Otherwise, with too many redundant parameters, the pe- 
culiar properties of Clad2 could have easily been lost. 
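
The idea of a multivariate compromise between "as few reversals as possible" in each parameter can be illustrated with a toy calculation. The sketch below is not the parsimony algorithm used in cladistics, which counts character changes on a tree; it only counts changes of direction in the group medians along a linear ordering and sums them over parameters. The group labels and median values are hypothetical.

```python
def count_reversals(values):
    """Count changes of direction (up/down) along a sequence of values."""
    signs = []
    for previous, current in zip(values, values[1:]):
        if current != previous:  # ties carry no direction
            signs.append(1 if current > previous else -1)
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


def total_reversals(medians_by_parameter, ordering):
    """Sum the reversals over all parameters for a given group ordering."""
    return sum(
        count_reversals([medians[group] for group in ordering])
        for medians in medians_by_parameter.values()
    )


if __name__ == "__main__":
    # Hypothetical medians of two parameters for three groups A, B, C.
    medians = {
        "param1": {"A": 1.9, "B": 2.3, "C": 2.1},
        "param2": {"A": 0.6, "B": 0.9, "C": 0.5},
    }
    for ordering in (("A", "B", "C"), ("A", "C", "B"), ("B", "A", "C")):
        print(ordering, total_reversals(medians, ordering))
```

An ordering that is smoothest for one parameter is generally not the smoothest for another, so the preferred ordering depends on which parameters enter the sum.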

The relative and distinctive properties of the galaxies from
the different groups obviously cannot be summarized with only
one or two physical parameters. The relative properties of the
groups show that the evolution of galaxies is not linear. The
global trend in some properties (such as mass, metallicity, or H$\beta$)
may appear to be roughly linear, but a detailed analysis,
and especially the distinctive properties within each group,
give many clues to understand the assembly history of the
corresponding galaxies.

Having highlighted the group properties, we examine the 
possible correlations between them in the next two sections. 



7. Scatter plots and correlations 

Scatter plots must be examined using the partitioning to look for 
different behaviours between groups or within groups. 

In the first case, the distribution of the groups traces the
projected evolutionary track given by the tree. The
fundamental plane is one example (Sect. 7.4.1 and Fraix-Burnet et al.
2010). We however focus on cases showing a roughly linear
track, where the groups are approximately ordered along a
linear correlation. We refer to these relations as evolutionary
correlations (Sect. 7.1) since these groups are related in cladistics
by evolutionary relationships. They are important since they
imply that the observed relation can be generated mainly by
evolution, as found in Fraix-Burnet et al. (2010) and formalised in
Fraix-Burnet (2011).

In the second case (Sect. 7.3), some correlation may or may
not be present within a given group, independently of the global
behaviour between groups.

7.1. Evolutionary correlations

We confirm the evolutionary nature of the Mg2 - $\log\sigma$
correlation found by Fraix-Burnet et al. (2010) and identify several
other cases, the clearest ones being shown in Fig. 5. Such
evolutionary correlations are revealed by the succession of groups
ordered along the correlation with the most ancestral group
(Clad1) at one end and the most diversified ones (Clad7 and
Clad8 here) at the other end.

Several of these evolutionary correlations involve the
following set of parameters: $\log\sigma$, $\log r_e$, $M_{abs}$ (and $K_{abs}$), H$\beta$, Mg b
(and Mg1, Mg2, [MgbFe]', and Mg b/Fe), NaD, and log(diam).
Some relations are particularly tight (such as log(diam) vs $K_{abs}$ or
Mg b vs [MgbFe]'). The iron Lick indices Fe5270, Fe5335, and
Fe5406, as well as H-K, also follow an evolutionary correlation
with $M_{abs}$, although it is quite loose.

In all cases, except for log(diam) vs $K_{abs}$ and Mg b vs
[MgbFe]' (which are discussed in Sect. 7.3), the correlation is
not present within each group. This is a clear sign that there is no
direct causal physical link between the two variables, but simply
a change on average with galaxy diversification.
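
The distinction between a global correlation carried by the succession of groups and the absence of correlation within each group can be made concrete with a small synthetic example. In the sketch below, the data are entirely artificial (three groups whose centres lie on a line, with uncorrelated scatter inside each group) and the function names are illustrative; it is not a reproduction of the analysis in the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def make_groups():
    """Three artificial groups aligned along y = x, uncorrelated inside."""
    dx = [-1.0, 1.0, -1.0, 1.0]
    dy = [-1.0, -1.0, 1.0, 1.0]  # chosen so that sum(dx * dy) = 0
    groups = {}
    for name, centre in (("G1", 0.0), ("G2", 5.0), ("G3", 10.0)):
        groups[name] = ([centre + d for d in dx], [centre + d for d in dy])
    return groups


def global_and_within(groups):
    """Return (global r, {group: within-group r})."""
    all_x = [x for xs, _ in groups.values() for x in xs]
    all_y = [y for _, ys in groups.values() for y in ys]
    within = {name: pearson(xs, ys) for name, (xs, ys) in groups.items()}
    return pearson(all_x, all_y), within


if __name__ == "__main__":
    r_global, r_within = global_and_within(make_groups())
    print(f"global r = {r_global:.3f}")
    for name, r in r_within.items():
        print(f"{name}: within-group r = {r:.3f}")
```

In this construction the global coefficient is high only because the group centres are ordered along a line, which is the signature described in the text for an evolutionary correlation.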

Thomas et al. (2005) discuss the origin of the Mg b vs $\log\sigma$
correlation. They find that metallicity, not age, is the main
driving factor. This would again justify the use of metallicity as a
reasonable tracer of diversification. However, we find instead
that the Mg b vs $\log\sigma$ correlation is an evolutionary correlation,
implying that diversification is indeed the real driver: metallicity,
like central velocity dispersion, is bound to change on average as
the galaxies evolve. This could explain why investigations find
that this correlation appears so sensitive to several parameters:
it has been proposed to be driven by metallicity, age, and
relative abundance of different heavy elements (see Matkovic et al.
2009, for references). This sensitivity more probably points to
an underlying, hidden, and confounding factor, which creates the
apparent correlation (Fraix-Burnet 2011).

The correlations between Mg b and [MgbFe]' and between
NaD and [MgbFe]' are clearly evolutionary, with
diversification increasing from left to right in Fig. 5, while the correlations
found by Thomas et al. (2011) are driven by total metallicity,
which, for a given age and light-element ratio, increases from
left to right in their Figs 6 and 8. The dispersion in the NaD vs
[MgbFe]' relation is larger than that in the Mg b vs [MgbFe]'
relation for our data and theirs, and not well accounted for by their
model, presumably because of the fixed age and light-element
ratio assumptions. Nevertheless, there is agreement between our
result and their model since the average metallicity of galaxies
obviously increases with diversification.

The [MgbFe]' vs $\log\sigma$, Mg b/Fe vs $\log\sigma$, and Mg b vs
$\log\sigma$ correlations (Fig. 5) can be compared with the Z/H vs
$\log\sigma$ and $\alpha$/H vs $\log\sigma$ in Figure 16 in Kuntschner et al. (2010).
The same correlation is present, but we clearly show its
evolutionary nature. Kuntschner et al. (2010) ask the question: "What
drives the [$\alpha$/Fe] - $\log\sigma_e$ (or mass) relation?". Our answer is
simply: diversification. They indeed arrive at the same
conclusion because they find that, in their sample, "there is evidence
that the young stars with more solar-like [$\alpha$/Fe] ratios, created
in fast-rotating disc-like components in low- and intermediate-
mass galaxies, reduce the global [$\alpha$/Fe] and thus significantly
contribute to the apparent [$\alpha$/Fe] - $\log\sigma_e$ relation". These
galaxies belong to our Clad1 group as stated above, but they are
not solely responsible for this "apparent" relation, since
all our groups are aligned along the same trend.

The Faber-Jackson relation, $M_{abs}$ vs $\log\sigma$ (Fig. 5), also
appears to be a purely evolutionary correlation: the sequence of
evolutionary groups is aligned along this correlation. If mass
had been the hidden parameter, then a similar correlation should
exist within each group. This is not the case. This result was
corroborated in an independent way by Nigoche-Netro et al. (2011).



7.2. Diversification or ageing? 

There is a well-known degeneracy between the age (measured by
H$\beta$) and the metallicity (measured by [MgbFe]' or
(0.69*Mgb+Fe5015)/2) of stellar populations, which models of stellar
evolution have tried to circumvent (e.g. Tripicco & Bell 1995). In
particular, Thomas et al. (2011) and Kuntschner et al. (2010)
superimposed evolutionary tracks from different models on galaxy
observations of H$\beta$ as a function of a metallicity indicator
defined by (0.69*Mgb+Fe5015)/2. We reproduce the same figures
in Fig. 6 (left), showing with crosses the median for each group
and with arrows the principal direction of increases in age and
metallicity. We emphasize that the stellar evolution models used
by Thomas et al. (2011) and Kuntschner et al. (2010) to produce
their figures are single population models with fixed solar value
of [$\alpha$/Fe], while the data used in this paper (Sect. 2.1) are
integrated over the whole galaxy, mixing together the contributions
of possibly several different stellar populations.

Our groups are clearly arranged according to diversification
following an increase in both age and metallicity. The spread of
the correlation is large, and, within each individual group, the
range in age and metallicity is also large. This dispersion could
certainly be explained by many factors, such as an extended
horizontal branch, which can increase H$\beta$ (e.g. C^gdo 1997;
Matkovic et al. 2009). In all cases, the median for each group is
nearly perfectly aligned between the two axes for age and
metallicity, ordered as in Fig. 3, except for Clad2 and Clad7, which
depart from the main alignment. This kind of plot indeed merely
tells us that the age, metallicity, and H$\beta$ of galaxies evolve on
average with time.

However, we find no correlation between age and
metallicity within individual groups. This is clearly shown in Fig. 6
(right), which plots H$\beta$ vs [MgbFe]', which is a better indicator
of metallicity than Mg b. It is striking that the elongated
inertial ellipses for Clad3, Clad5, and Clad7 are well aligned with
the H$\beta$ axis, with relatively little spread in metallicity, the one
for Clad1 is slightly inclined, and the one for Clad2 is aligned
along the global trend. Since the other ellipses are quite round,
Clad1 and Clad2 are the sole groups that might show a barely
significant correlation between H$\beta$ and a metallicity indicator.
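
The orientation and roundness of such an inertia ellipse follow from the second moments of the points in a group. The sketch below is one common way to compute them, assuming the ellipse is defined from the 2x2 scatter matrix of the points; the paper does not state its exact construction, and the function name is illustrative. An angle near 90 degrees means alignment with the vertical axis, and an axis ratio near 1 means a round ellipse.

```python
import math


def inertia_ellipse(xs, ys):
    """Return (angle_deg, axis_ratio) of the scatter ellipse of the points.

    angle_deg is the orientation of the major axis measured from the x axis,
    in [0, 180); axis_ratio is minor/major, between 0 and 1.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    # Orientation of the principal axis of the 2x2 scatter matrix.
    angle = 0.5 * math.degrees(math.atan2(2.0 * sxy, sxx - syy)) % 180.0

    # Eigenvalues of the scatter matrix give the squared axis lengths.
    half_trace = 0.5 * (sxx + syy)
    radius = math.hypot(0.5 * (sxx - syy), sxy)
    largest, smallest = half_trace + radius, half_trace - radius
    ratio = math.sqrt(max(smallest, 0.0) / largest)
    return angle, ratio


if __name__ == "__main__":
    # Artificial points stretched along the vertical axis.
    print(inertia_ellipse([0.0, 0.0, 0.1, -0.1], [-2.0, 2.0, 0.0, 0.0]))
```

For the perfectly isotropic case the orientation is undefined, so the angle returned for a nearly round ellipse should not be over-interpreted.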

The wider range in H$\beta$ for the less diversified groups,
especially Clad1 and Clad3, clearly appears in Fig. 6 and can also
be seen in Fig. 4. It probably corresponds to the well-known
wider range in age of the low-mass objects (e.g. Matkovic et al.
2009). Our interpretation is not that the low-mass galaxies have
had a longer star formation history: we propose instead that
these objects formed or appeared over a longer timescale in the
Universe's history and the older ones have not changed much,
apart from the ageing of stars. Larger galaxies, on average,
necessarily took more time to assemble and become more complex
(diversify), so that it is very unlikely to find young and very
diversified galaxies. However, the notion of galaxy age must be
questioned.

Figure 6 illustrates the fundamental difference between
diversification and age. The Clad1 group, which is assumed to
be ancestral because of its low metallicity, appears to be the
youngest group on average according to stellar evolution
models. If a galaxy is to resemble the most pristine objects, it must
not have been transformed too much even by secular evolution.
This is why the most pristine objects are necessarily relatively
young and metal-poor. Conversely, the most diversified objects
have a higher average stellar age, which provides no
information on the epoch of the transforming events that gave them their
observed properties. As a consequence, old galaxies are not
obvious ancestors. In addition, the spread in age within each group
is generally large and overlaps with the spread in age of the other
groups. Age is thus not a good landmark of evolution. From the
point of view of astrocladistics, the so-called downsizing effect
results from a confusion between age and the level of
diversification (Fraix-Burnet et al. 2006b, 2009).

Hence, the age of a galaxy or a group of galaxies is probably
not so important and may even be meaningless (Serra & Trager
2007). We even find this term misleading, and suggest it be
replaced by "average stellar age". The diversification state should
be used instead, as it reflects the actual assembly history of a
galaxy.

7.3. Correlations within groups ("specific correlations") 

As previously seen (Sect. 7.1 and Fig. 5), the two scatter
plots of log(diam) vs $K_{abs}$ and Mg b vs [MgbFe]' show both
a global evolutionary correlation and correlations within the
groups (which we call "specific correlations"). Four other scatter
plots only show specific correlations, the global correlation
being less obvious and/or more dispersed: $Bri_e$ and $K_{abs}$ vs $\log r_e$,
$\log(M_{dyn})$ vs $\log r_e$, and [MgbFe]' vs Fe5270. These six specific
correlations are shown in Fig. 7.

Diversification within each group is determined by the
structure of the tree in Fig. 3. If we examine the evolution of the
parameters involved in the correlations along each branch (thus
each group) of the tree, we find that:

- $K_{abs}$ increases slightly with diversification within Clad6 and
  Clad8;

- [MgbFe]' might possibly increase in Clad8;

- $\log r_e$ might possibly decrease in Clad6;

- $M_{dyn}$ decreases slightly in Clad6 and might possibly increase
  in Clad5 and Clad8.

This is clearly not enough to explain all the observed specific
correlations with evolution within the groups. However, the
difference between objects of the same group is weaker than for the
whole sample and thus would require refined cladistic analyses
with possibly additional descriptors. We now examine in some
detail the scatter plots in Fig. 7.

The correlation, which is particularly tight and linear,
between Mg b and [MgbFe]', also holds within each group. Since
the first parameter largely depends on the $\alpha$-elements, while the
second is essentially independent of them (Thomas et al. 2011),
these specific correlations can probably be explained, in a
similar way to the global one, by evolution. The correlations between
[MgbFe]' and either Fe5270 or Fe5335 have larger scatters.

The correlation between log(diam) and $K_{abs}$ seems to be
present within each group. This is not proof, however, that it is
a causal relation (the larger the galaxy, the more luminous):
it could still be due to evolution within each group or to some
other confounding parameter (Fraix-Burnet 2011). In addition,
all correlations, either global or specific, are approximately
similar, which suggests that they have the same explanation.

The Kormendy relation ($Bri_e$ vs $\log r_e$) clearly appears to
depend on the group, and has a far smaller scatter for the most
diversified groups. There is an "evolution" in the correlation
curve following diversification, the different relations appearing
stacked on each other. At first glance, galaxies are brighter when
more diversified, but this is not so simple if we look at Clad7
and Clad8: galaxies from the first group are globally larger and
fainter. The correlation also has a larger scatter for Clad1 and
Clad3.

The $K_{abs}$ vs $\log r_e$ relation is quite dispersed, but there are
slightly more convincing correlations for some groups, at least
for the most diversified ones. For Clad1, there is little variation in
$\log r_e$, so that there is no real correlation. The difference between
this relation and the $K_{abs}$ vs log(diam) one is striking.

The $M_{dyn}$ vs $\log r_e$ relation is tight, with very clear
correlations specific to each group and with relatively little
overlap between them. This plot can be usefully compared to
numerical simulations (e.g. Robertson et al. 2006), as done in
Fraix-Burnet et al. (2010; see Sect. 7.4.1 and Sect. 8).

The [MgbFe]' vs Fe5270 relation is quite dispersed, despite
the square-root relation linking both parameters. Correlations
can be easily seen within groups, and they appear to generally
differ (particularly in terms of the slope) from the global
relation.
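
The square-root link mentioned above can be made explicit. The sketch below assumes the commonly used definition of the composite index, [MgFe]' = sqrt(Mg b * (0.72 * Fe5270 + 0.28 * Fe5335)), which is not restated in this section; the function name and the index values are illustrative.

```python
import math


def mgbfe_prime(mgb: float, fe5270: float, fe5335: float) -> float:
    """Composite index sqrt(Mg b * (0.72 * Fe5270 + 0.28 * Fe5335)).

    Assumption: this is the usual definition of [MgFe]'; all three
    inputs are Lick indices expressed in the same units.
    """
    return math.sqrt(mgb * (0.72 * fe5270 + 0.28 * fe5335))


if __name__ == "__main__":
    # Hypothetical index values: vary Fe5270 at fixed Mg b and Fe5335.
    for fe5270 in (2.0, 3.0, 4.0):
        print(fe5270, round(mgbfe_prime(4.0, fe5270, 3.0), 3))
```

Because Mg b and Fe5335 also vary from galaxy to galaxy, the dependence on Fe5270 alone is diluted, which is consistent with a dispersed relation.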

To summarize, there are several cases where the specific cor- 
relations are present in all or some of the groups, regardless of 
whether the global correlation exists. Why is this so? 

If the correlation is present both globally and within groups,
then we can guess that it is for the same reason. In the two cases
here (log(diam) vs $K_{abs}$ and Mg b vs [MgbFe]'), the global
correlation is evolutionary (Sect. 7.1), and since all correlations
appear to have approximately the same slope, then the specific
correlations should be evolutionary as well.

In the other cases where there is no obvious global corre- 
lation, the reason must be specific to the group, and probably 
different from one group to another. The correlations might be 
explained by a direct physical cause, or by a confounding pa- 
rameter, which can still be evolution. Note that the confounding 
factor may depend on the group. 

Anyhow, the origin of the correlations and their properties
is quite complex. In the case of the $M_{dyn}$ vs $\log r_e$ relation,
numerical simulations show that it is determined by several
variables involved in the assembly history, such as the epoch of the
last merger, the level of dissipation, the number of accretion
events, the impact parameters, and so forth (Robertson et al.
2006). Thanks to a good diversity of simulated galaxy
populations, Fraix-Burnet et al. (2010) were able to derive the assembly
history of each group. The specific correlations can then be
explained by either several drivers from the physical point of
view, "cosmic variance" within each group from the
observational point of view, or confounding factors from a statistical
point of view.

Consequently, the two scatter plots showing both global
and specific correlations are probably driven by a dominant
general evolutionary factor (such as perhaps dynamical
evolution for log(diam) vs $K_{abs}$ and chemical evolution for Mg b vs
[MgbFe]') affecting all galaxies of the sample, while the other
ones have multiple and necessarily specific factors, as in the
$M_{dyn}$ vs $\log r_e$ relation. For instance, the importance of merger
events applies only to those galaxies that have experienced such
a catastrophic transforming process during their assembly
history.

7.4. Fundamental planes 

7.4.1. The fundamental plane of early-type galaxies 

The well-known and intensively studied correlation between
$Bri_e$, $\log\sigma$, and $\log r_e$ is called the fundamental plane. The first
multivariate analysis of this relation was performed recently
by Fraix-Burnet et al. (2010). The present sample and that of
Fraix-Burnet et al. (2010), which are both at low redshift, have
no galaxy in common and the parameters used to partition the
sample into groups are different.
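
A plane relation between three observables of this kind is usually characterised by a least-squares fit. The sketch below fits z = a + b*x + c*y with the standard library only, by solving the 3x3 normal equations. Treating one variable (for instance $\log r_e$) as the dependent quantity is a choice made here for illustration, not the procedure of the paper, and the data in the usage example are synthetic.

```python
def fit_plane(xs, ys, zs):
    """Least-squares fit of z = a + b*x + c*y; returns (a, b, c)."""
    rows = [(1.0, x, y) for x, y in zip(xs, ys)]
    # Augmented 3x4 matrix of the normal equations (R^T R) p = R^T z.
    m = [
        [sum(r[i] * r[j] for r in rows) for j in range(3)]
        + [sum(r[i] * z for r, z in zip(rows, zs))]
        for i in range(3)
    ]
    # Gaussian elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda k: abs(m[k][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for k in range(col + 1, 3):
            factor = m[k][col] / m[col][col]
            for j in range(col, 4):
                m[k][j] -= factor * m[col][j]
    # Back substitution.
    p = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        rest = sum(m[i][j] * p[j] for j in range(i + 1, 3))
        p[i] = (m[i][3] - rest) / m[i][i]
    return tuple(p)


if __name__ == "__main__":
    # Synthetic points lying exactly on z = 1 + 2x - 3y.
    xs = [0.0, 1.0, 0.0, 1.0, 2.0]
    ys = [0.0, 0.0, 1.0, 1.0, 3.0]
    zs = [1.0 + 2.0 * x - 3.0 * y for x, y in zip(xs, ys)]
    print(fit_plane(xs, ys, zs))
```

Fitting the plane separately for each group and comparing the residuals is one simple way to look for structures within the plane.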

The present partitioning is in excellent agreement with the
result of Fraix-Burnet et al. (2010), as illustrated in Fig. 8, which
shows the projection onto the fundamental plane ($\log\sigma$ vs
$Bri_e$) of the partitionings obtained by cladistics and cluster
analysis in the present paper, and the partitioning obtained by
Fraix-Burnet et al. (2010). The structures within the
fundamental plane found in Fraix-Burnet et al. (2010) are thus confirmed.

There is a good correspondence between the groups in the
two studies, as can be seen in Fig. 8 and in more detail in
Fig. 9: C1 includes Clad4 and a large part of Clad3, C4 $\simeq$
Clad6, C3, C5, and C6 are essentially included into Clad7, and
C7 $\simeq$ Clad8. The $\log r_e$ vs $M_{dyn}$ diagram (Fig. 9 and Fig. C.4)
confirms these equivalences, pointing out that C1 overlaps both
Clad3 and Clad4 and is distributed in a way that is more similar
to Clad3. There are however some differences.

Clad1 seems to occupy a region of the fundamental plane
($\log\sigma$ vs $Bri_e$) that is not very well covered by the sample used
by Fraix-Burnet et al. (2010).

Clad2 has no equivalent in Fraix-Burnet et al. (2010) when
projected onto the fundamental plane (Fig. 8). This group is also
plotted separately in Fig. 9 to show that it follows the same
correlation as the other groups, spanning nearly the full range of
both $\log r_e$ and $M_{dyn}$.

Clad5 is also absent in Fraix-Burnet et al. (2010). Since it
appears in the very centre of both Fig. 8 and Fig. 9, we believe
that it is identified here because of the larger number of
parameters used here, which in effect provides a higher-resolution
analysis. Its properties were also found to be quite similar to Clad4
(Sect. 6), so that Clad4 is quite different from C1 (see above).

Group C2 of Fraix-Burnet et al. (2010) is absent in the
present partitioning. This is perhaps because the corresponding
regions of the fundamental plane (Fig. 8) are not very populated
in our sample, or because of the different sets of parameters used
in the two studies.

and log(diam) (Fig. 10). The two latter parameters are obviously
reminiscent of $Bri_e$ and $\log r_e$, while H$\beta$ is of a different nature
from $\log\sigma$, so that this fundamental plane is not redundant with
the classical one. Replacing H$\beta$ by D/B, one also obtains a nice
correlation, thus another fundamental plane.

It is interesting to note that the parameters of these
fundamental planes might be easier to observe than for the
classical one, but still allow the distance determination thanks to
log(diam).

There may be additional similar surfaces in higher-dimension
parameter spaces but that have more complex
projections. Would these be interesting to discover and would they
be more useful than the classical fundamental plane?

The answer to these questions is twofold. First, they are
identical to the classical fundamental plane, in the sense that they
are essentially evolutionary correlations driven by diversification
(Fraix-Burnet 2011). All are simply projections of the tree in a
sub-parameter space. They are thus not more and not less
informative. Second, they can be useful once the confounding factor
(here evolution) has been taken into account. This is however
beyond the scope of the present paper.

Fig. 10. Another "fundamental plane" in the parameter space
defined by H$\beta$, $K_{abs}$, and log(diam).

To complete the comparison between the two studies, we
performed the same analysis as in Fraix-Burnet et al. (2010),
using the same four parameters $\log\sigma$, $\log r_e$, $Bri_e$, and Mg2
(Appendix B). Naturally, since these four parameters are not all
discriminant for the present sample, the resulting tree and the
corresponding partitioning are slightly less robust. However, the
agreement is still quite good. The sample used in the present
paper (Ogando et al. 2008) is globally at a lower redshift than
the one used in Fraix-Burnet et al. (2010) (Hudson et al. 2001).
This renders the determination of the distance, and thus $\log r_e$,
less accurate (see Ogando et al. 2008). This could partly
explain why $\log r_e$ has not been found to be discriminant or why
the four-parameter result here is slightly less robust than in
Fraix-Burnet et al. (2010).

We thus confirm the result of IFraix-Burnet et aD (1201 Ob that 
there are structures within the fundamental plane. Since the or- 
ganisation of the groups on this plane defines very similar evolu- 
tionary paths corresponding to a clear trend in diversification as 
defined by Fig. [3] we confirm that this global relation in three- 
parameter space is mainly driven by diversification. For the sake 
of clarity, we stress that this evolutionary interpretation of the 
fundamental plane holds for the groups, not within the groups, 
since neither the present paper nor that by Fraix-Burnet et al] 
(2010) tackle this questi on. This would certainly m erit further 
specific studies because IFraix-Burnet et all (1201 Ol) found that 
the tightness of the fundamental plane strongly depends on the 
group considered, the least diversified ones showing a very loose 
- if significant at all - correlation. 



7.4.2. Other correlation planes 

Concerning the fundamental plane, the groups follow one an- 
other along a path of diversification in the log r e vs log cr (edge- 
on projection) scatter plot, whereas they are clearly distinguish- 
able and distributed in no obvious order in the Bri e \s log r e 
(face-on projection) scatter plot . 

Hence, one can expect to find other "fundamental planes" 
by looking at the behaviour of groups in two scatter plots made 
with a set of three parameters. This is the case for HB, K a b s , 



8. The assembly history of early-type galaxies 

Fraix-Burnet et al. (2010) uncovered the assembly history of a
completely different sample of early-type galaxies by analysing
its properties in both the fundamental plane and a mass-radius
diagram with the help of numerical simulations from the
literature. This sample was composed of galaxies in clusters, while
the present one is composed of galaxies in the field, groups, or
clusters, and has the advantage of more properties being
documented. Emission-line galaxies were excluded from both
samples. We repeat here the exercise for both sets of groups merged
together following the correspondence detailed in Sect. 7.4.1:

- Clad1: these galaxies have low metallicity and in many
respects look quite primitive, having a very low dynamical
mass. They are less metallic and more primitive than the
Clad3/C1 ones, which were chosen as the most primeval
group in Fraix-Burnet et al. (2010). Clad1 galaxies could be
the remains of a simple assembly through a monolithic
collapse with little dissipation and their somewhat discy nature
probably requires significant feedback and a few
perturbations (Benson 2010).

- Clad2: this group shows a steep correlation between log r_e
and M_dyn. This could be indicative of some merger processes
(e.g. Ciotti et al. 2007) but the galaxies are discy. However,
they also have a high log σ. This suggests that these galaxies
are quite primitive objects similar to those of Clad1 but since
they are more massive, they underwent a more significant
secular evolution, perhaps as in the case of "pseudo-bulges"
(Kormendy & Kennicutt 2004; Benson 2010).

- Clad3/C1: chosen as the most primeval group because of its
low average Mg2 in Fraix-Burnet et al. (2010), this group is
here surpassed by the still lower Mg2 of Clad1. They found
that galaxies of C1 are possibly the remains of a simple
assembly through a monolithic collapse with little dissipation,
and were probably perturbed by interactions. We propose
instead that accretion is the main perturbation because the
Clad3 galaxies are small, not very concentrated, and
have a low log σ.

- C2: they were found in Fraix-Burnet et al. (2010) to be less
massive and smaller than the ones of Clad3/C1, and have a
slightly higher Mg2. They are also somewhat brighter. They
could be the remains of wind stripping of some kind of more
diversified objects because of a strong interaction.

Fig. 11. Combination of the tree from this paper (Fig. 3) and the one in Fraix-Burnet et al. (2010), showing the proposed assembly
history. Numbers indicated at nodes are referred to in the text.

- Clad4: this group is very similar to Clad2, but since the
Clad4 galaxies are larger and have a higher M_dyn, they might
have been perturbed by a strong interaction that yielded a
more massive central black hole.

- Clad5: being very similar to Clad4 objects, they could be 
these discy galaxies seen edge-on but this is statistically un- 
tenable because there are three times more Clad5 objects 
than Clad4 ones. Since they have a very low D/B, a more 
likely explanation is that Clad5 galaxies could be perturbed 
versions of Clad4 members. These perturbations are prob- 
ably mergers since these objects have lost the disciness of 
Clad4 objects, but to preserve similar properties, little gas 
should be involved, which implies that dry mergers are the 
most probable transforming events. 

- Clad6/C4: three scenarios were proposed for C4 in
Fraix-Burnet et al. (2010): these objects could simply be
either galaxies in which star formation has been continuous,
C1 galaxies in which the initially richer gas has not been
swept out, or the remnants of several minor mergers and
accretion. The Clad6/C4 galaxies have unexpectedly low
values of M_dyn, Bri_e, and log r_e, as well as to a lesser extent
log(diam). Otherwise, they do not look odd, so that a
continuous star formation with few external perturbations could
also be a reasonable explanation. However, we find that they
are very similar to Clad2, which we proposed to have
undergone significant secular evolution, but have a much lower
D/B. This suggests that many interactions, such as
harassment (Moore et al. 1996), could be the culprit.



- Clad7: because Clad7 includes the groups C3, C5,
and C6, their history might be complex according to
Fraix-Burnet et al. (2010), involving many transforming
events (accretions, minor mergers, together with more or less
dissipational major mergers). They are probably the remains
of both wet and dry mergers, the most recent ones being of
the latter kind. The low value of Hβ, which would indicate that
the last star formation event is relatively ancient, reinforces
this interpretation. They could represent a kind of end state
of galaxy diversification.

- Clad8/C7: C7 galaxies were found to be small and very 
metallic with a high surface brightness, and to define a tight 
FP. They seemed to be associated with the remains of a dissi- 
pative (wet) merger, with very little or no dry mergers. They 
might also have formed through minor mergers and accre- 
tions but the tight FP favors the dissipative wet-merger sce- 
nario. The low D/B found here for Clad8/C7 tends to con- 
firm this conclusion. We believe that they may well define 
another possible end state for galaxy diversification. 

We summarize the above histories in a single cladogram
(Fig. 11), combining the trees obtained in the two studies.

The best way to interpret the evolutionary scenario depicted
in this cladogram is to identify, for each node of the tree, a
particular transforming event that could characterize all groups
related by branches and sub-branches starting from this node
(Fraix-Burnet et al. 2006c). The sequence of nodes downwards
starting at the upper left of the tree thus defines a sequence of
"innovations" that occurred in a common ancestor and were
transmitted to all its descendant species. In principle, these
innovations are the properties of the galaxies that remain as imprints
transmitted through subsequent transforming processes. Here,
we consider the transforming events as innovations since they
are the origin of the modifications of the properties of
galaxies. However, it should be kept in mind that parallel evolutions
(events occurring independently on two different lineages),
convergences (different pathways leading to the same parameter
value), and reversals (backward parameter evolution) are
probably present, making this exercise currently quite tentative. These
behaviours (called homoplasies) are supposedly not too
numerous here since the parsimony optimisation of the cladistic
analysis minimizes the occurrence of these types of parameter
evolutions. We do not discuss them in this very first attempt, especially
because the properties at hand are few.

We attempt to identify these innovations in Fig. 11 using
the possible histories of each group previously identified. They
appear on the tree when they first occur in the history of the
Universe. It is thus expected that the most basic transforming
events occur very early and the more complex ones later,
making the sequence of "innovations" along the tree a representation
of the so-called cosmic evolution.

Finally, we remind the reader that while the level of diversity
increases along the vertical axis in Fig. 11, the horizontal axis,
especially the branch lengths, has no particular meaning in this
representation. It is important to keep in mind that we are dealing
with present-day galaxies and properties, not those that prevailed
at the time of the transforming event indicated at the node from
which a given branch emerges. Simply speaking, this means that
the galaxies of, say, Clad1, are present-day galaxies that have
passively evolved from a less diversified initial state than those
of either Clad6 or Clad7.

The nodes are identified by the numbers in Fig. 11. The
corresponding proposed events are:

1. collapse; 

2. secular evolution; 

3. accretion; 

4. interaction; 

5. "gaseous" interaction; 

6. dry merger; 

7. harassment; 

8. wet mergers. 

The first transforming event (node 1), which marked the
history of all the galaxies, is most likely monolithic collapse. This
is probably the simplest process to form a self-gravitating
ensemble of gas, stars, and dust that we can call a "galaxy". The
galaxies of Clad1 evolved passively after exhausting their gas
reservoir.

The next three events (nodes 2-4) must have been gentle,
since the discy morphology is well preserved until Clad4 and
it is now well established that minor mergers generally preserve
the structure of discy galaxies, while mergers of galaxies of
comparable masses generally do not. We first have (node 2)
secular evolution, which is defined as the evolution of a galaxy in
isolation, and which is expected to be far more frequent and capable
of significantly modifying the structure and properties of
galaxies. At node 3, accretion must then be invoked to
increase the masses of galaxies. A more complex event follows
(node 4), namely interaction, which is an external perturbation that, with
the wealth of possible impact parameters and galaxy properties,
is probably the main driver of galaxy diversity, especially during
the first Gyr of the Universe (Benson 2010).



For node 5 between C2 and Clad4, interaction involving gas 
must be invoked to either strip the gas in C2 galaxies by ram 
pressure or feed the central black hole in Clad4 objects. 

Mergers must be invoked at node 6 since the more
diversified groups have lost their disciness. For Clad5 galaxies, the most
probable transforming event is a major merger without much gas
(dry merger).

A substantial star formation must have occurred in Clad6
galaxies, and several properties indicate repeated
perturbations, implying that harassment is a good candidate for node
7 (Moore et al. 1996). Harassment is the cumulative effect of
high-speed galaxy encounters, which heats the disc (log σ
increases) and favors gas inflow to the galaxy center. This kind
of transforming event acts on a longer timescale than the dry
mergers at node 6, which explains why it appears "later" in the
cladogram.

The two most diversified groups, Clad7 and Clad8, are found 
to have had complex histories, certainly including wet mergers 
(node 8). 

Many associated processes, such as feedback and the
quenching of star formation (e.g. Bundy et al. 2006; Benson
2010), are not proposed here because we have concentrated on
more generic events. A significant difficulty of such an exercise
is to identify some properties with transforming events that are
very complex, involving diverse impact parameters and various
chemical, physical, and dynamical processes. We believe that it
is somewhat illusory to associate a particular feature with
each of these events, and only statistical analyses of simulated
cases could provide average properties that could be compared
to statistical analyses of real objects similar to the one we have
performed here.

9. Conclusion 

We have used several multivariate tools, first to select the
most discriminant parameters from the 25 initially available for
the sample of 424 fully documented galaxies of Ogando et al.
(2008), and second to partition the sample into groups. The three
partitioning methods yield similar numbers of groups and similar
composition for each of them, considering that some fuzziness
is expected.

Our first result is that among the initial 23 quantitative
parameters available in this study, only 6 are discriminant and
actually yield a relatively robust partitioning for this sample. Among
the 10 Lick indices, only 2 ([MgbFe]' and NaD) are
discriminant, together with log σ, Bri_e, OIII, and D/B.

The global evolutionary scenario found by astrocladistics 
gives a very sensible result: galaxies tend to globally become 
more metallic, more luminous (more massive), and larger with 
increasing diversification. At the same time, they acquire a larger 
central velocity dispersion, which is often related to the mass 
increase, and NaD also increases along with both mass and ve- 
locity dispersion, as expected. These are global statistical trends 
that are explained by general basic physical and chemical pro- 
cesses as a function of time since the Big Bang. 

As a consequence, the many properties of galaxies that are
bound to evolve on average with galaxy diversification explain
the several evolutionary correlations found in this paper. In
particular, we have confirmed the evolutionary nature of the Mg2 vs
log σ correlation found by Fraix-Burnet et al. (2010) for a
different sample and using a different set of parameters. Rather
interestingly, we have also found some correlations that are specific
only to some groups. These can be attributed to either a direct
physical cause or a confounding factor specific to some groups
(such as the epoch of the last merger, the level of dissipation,
the number of accretion events, the impact parameters, or the
number of mergers).

One of the most important results of our work is that the
structures defined by the partitioning, when projected onto
either the fundamental plane (log σ vs Bri_e) or the log r_e vs M_dyn
diagram, are very similar to those found by Fraix-Burnet et al.
(2010) for a totally distinct sample with different parameters.

The fundamental plane of early-type galaxies is
very probably generated by diversification. In support of
this, we have also found another "fundamental plane", a
three-dimensional correlation between Hβ, K_abs, and log(diam).
All scatter plots are basically simple projections onto a
sub-parameter space of the partitioning established in the
six-parameter space. Thus, there is less information in either these
scatter plots or "fundamental planes" than in the multivariate
partitioning and the evolutionary tree obtained with cladistics.

Another important result is that six parameters - no fewer
- are needed to describe the diversity of this sample. The three
parameters of the fundamental plane (log σ, Bri_e, and log r_e),
plus the index Mg2, have not yielded as robust a partitioning
here, although they did in a previous study for a distinct sample
(Fraix-Burnet et al. 2010). We argue that there is no
contradiction here, because three discriminant parameters extracted in the
present paper are present in our previous study (log σ, Bri_e, and
Mg2, the latter being replaced here by [MgbFe]'). The multivariate
analyses generally depend on both the objects in the sample and their initial
set of descriptors, both of which are different in the two
studies. Nevertheless, as a consequence, similar analyses will have
to be conducted on other samples with other descriptors, since
our fairly small sample of nearby galaxies cannot represent the
diversity of galaxies throughout the Universe, and the available
parameters are here restricted to the visible domain.

We have combined the results of Fraix-Burnet et al. (2010)
and those in the present paper on a single cladogram, showing 
the possible assembly history for each group. From this clado- 
gram, we have attempted to identify the transforming events 
that are at the origin of galaxy diversification. The transform- 
ing events that we have indicated as "innovations" are tentative, 
because the information at hand is insufficient to identify them 
with certainty. These proposed events show that the use of so- 
phisticated statistical tools yields a very sensible classification. 
Figure 11 is the basis of an explanatory classification linking
the objects to the fundamental transforming processes, i.e. to the 
physics, rather than a descriptive classification adopted in most 
current classifications of galaxies. In this respect, we note that 
the Edwin Hubble classification is of the latter type, being based 
on morphology, while his tuning fork diagram (often called the 
Hubble sequence nowadays) is explanatory since it indicates the 
links between the classes. Nearly one century later, we know that 
galaxies are characterized by much more than only their mor- 
phology, so that we need to generalize the Hubble diagram to a 
multivariate picture of galaxy diversification. Figure ??, which 
we have produced using cladistics, is one step in this direction. 

Hence, increasing both the sample size and the number of 
descriptors is an absolute requirement. The six-parameter space 
needed to describe the diversity of the sample used in the present 
paper is probably a minimum space because of the complexity 
of galaxies and their assembly history. The nature of the discrim- 
inant parameters might also change with the input of more ob- 
servables. In addition, the number of groups and their boundaries 
will certainly change. This is a double quest: classifying galaxies
into objectively established evolutionary and intelligible groups,
and finding the parameter space in which these groups can be
identified. This quest is necessarily progressive, and will
probably never end. However, one can hope that some convergence
will be reached.

A limitation of the present work is that cladistics cannot be 
applied directly to very large samples as the necessary com- 
puter time would be prohibitively excessive. However, once the 
most discriminant parameters are identified, it will be possible 
to repeat the cladistic analysis for many subsamples, and sub- 
sequently combine the trees to define classes of galaxies. The 
ultimate goal is to gather the huge number of galaxies in the 
Universe into a tractable number of groups and establish the
corresponding evolutionary relationships.

Appendix A: Methods

A.1. Principal component analysis

Principal component analysis (PCA) is well-known to as- 
tronomers. It is not a partitioning method: its aim is instead to 
reduce the dimensionality of the parameter space. From the cor- 
relation matrix, PCA builds eigenvectors (the principal compo- 
nents) that are orthogonal and linear combinations of the phys- 
ical parameters. These eigenvectors usually have no physical 
meaning. In general, most of the variance of the sample can be 
represented with only a few principal components (those hav- 
ing an eigenvalue greater than 1). They thus give a simpler rep- 
resentation of the data by eliminating the correlations between 
physical parameters. Strongly correlated parameters are gath- 
ered in the same eigenvector, and the most important parameters 
(with respect to variance) are the ones with the highest coeffi- 
cient (loading) in each eigenvector. The physical interpretation 
must be made back in the real parameter space. 

PCA is thus very efficient at reducing the parameter space to 
supposedly uncorrelated components and helps in detecting the 
most discriminant or discriminating parameters. The number of 
significant eigenvectors gives an idea of the number of parame- 
ters necessary to describe the sample. Principal components can 
also be used for subsequent cluster or cladistic analyses. 
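
As an illustration of the procedure described above, the following minimal sketch builds the principal components from the correlation matrix and counts those with an eigenvalue greater than 1. It is not the code used for this paper: the function name `pca_correlation`, the synthetic data, and the scaling of the loadings (eigenvector times the square root of the eigenvalue, so that each loading is the correlation between a parameter and a component) are assumptions made for the example.

```python
import numpy as np

def pca_correlation(data):
    """PCA from the correlation matrix of an (n_objects, n_params) array.

    Returns the eigenvalues (descending), the loadings, and the number of
    components with eigenvalue > 1 (the selection rule quoted in the text).
    """
    x = np.asarray(data, dtype=float)
    corr = np.corrcoef(x, rowvar=False)          # parameters are columns
    eigval, eigvec = np.linalg.eigh(corr)        # eigh returns ascending order
    order = np.argsort(eigval)[::-1]
    eigval, eigvec = eigval[order], eigvec[:, order]
    # Assumed convention: loading = eigenvector * sqrt(eigenvalue)
    loadings = eigvec * np.sqrt(np.clip(eigval, 0.0, None))
    n_keep = int(np.sum(eigval > 1.0))
    return eigval, loadings, n_keep

# Synthetic example: two strongly correlated parameters and one independent.
rng = np.random.default_rng(0)
t = rng.normal(size=500)
data = np.column_stack([
    t + 0.1 * rng.normal(size=500),
    t + 0.1 * rng.normal(size=500),
    rng.normal(size=500),
])
eigval, loadings, n_keep = pca_correlation(data)
print(eigval, n_keep)
```

The two correlated parameters load on the same first component, which is the behaviour described in the text: strongly correlated parameters are gathered in the same eigenvector.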

There is however a caveat to be kept in mind. PCA
eliminates all correlations, regardless of whether they are causal. It is
extremely useful to remove any redundancies, as well as physical
correlations between two parameters indicating the same
underlying process. However, PCA also removes evolutionary
correlations (which are called "spurious" or confounding in statistics,
Fraix-Burnet 2011), for instance between two parameters that
are independent but vary with time. The log σ - Mg2 correlation
for early-type galaxies (see Fraix-Burnet et al. 2010) is a good
example. Such independent evolutions are lost through the PCA
reduction of dimensionality.
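
The notion of a confounding correlation can be made concrete with a small synthetic experiment. In the sketch below, two hypothetical parameters `a` and `b` each depend on a common variable `t` (standing for time or the level of diversification) but are otherwise independent. All names, slopes, and noise levels are assumptions chosen for illustration and are not taken from the sample studied here.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000
t = rng.uniform(0.0, 1.0, size=n)          # hypothetical confounding variable
a = 2.0 * t + 0.2 * rng.normal(size=n)     # parameter 1 evolves with t
b = 1.5 * t + 0.2 * rng.normal(size=n)     # parameter 2 evolves with t, independently

def residual(y, x):
    """Remove the linear dependence of y on x."""
    slope, intercept = np.polyfit(x, y, 1)
    return y - (slope * x + intercept)

raw_r = np.corrcoef(a, b)[0, 1]
partial_r = np.corrcoef(residual(a, t), residual(b, t))[0, 1]
print(f"raw correlation: {raw_r:.2f}, after removing t: {partial_r:.2f}")
```

The raw correlation is strong although neither parameter causes the other, and it essentially vanishes once the dependence on `t` is removed.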

A.2. Minimum contradiction analysis 

Partitioning objects consists in producing some order. In some 
cases, i.e. in either hierarchical clustering or cladistics, the ar- 
rangement of the objects can be represented on a tree. A tree is 
a graph representing the objects as the leaves with a unique path 
between any two vertices. A bifurcating tree has internal vertices 
that all have a degree of at most 3 (at most 3 branches connect to 
any such vertex). 

By indexing circularly all the leaves of a planar
representation of a weighted binary tree, one obtains a perfect order,
meaning that the corresponding ordered distance matrix fulfils
all Kalmanson inequalities. Generally speaking, the Kalmanson
inequalities are fulfilled if the ordered distance matrix
corresponds to a weighted binary tree or a superposition of binary
trees (Thuillard & Moulton, 2011). The difference between the
perfect order and the order one obtains with a given dataset
is called the contradiction. The minimum contradiction
corresponds to the best order one can get.

The minimum contradiction analysis (MCA; Thuillard, 2007, 2008)
finds this best order. It is a powerful tool for ascertaining
whether the parameters can lead to a tree-like arrangement of the
objects (Thuillard & Fraix-Burnet, 2009). Using the parameters
that fulfil this property, the method then performs an
optimisation of the order and provides groupings with an assessment of
their robustness.

For taxa indexed according to a circular order, the distance
matrix $Y^n$, which is defined to be

$$Y^n_{i,j} = \frac{1}{2}\left(d_{i,n} + d_{j,n} - d_{i,j}\right),$$

fulfils the so-called Kalmanson inequalities (Kalmanson, 1975):

$$Y^n_{i,j} \ge Y^n_{i,k}, \qquad Y^n_{k,j} \ge Y^n_{k,i} \qquad (i \le j \le k), \tag{A.1}$$

where $d_{i,j}$ is the pairwise distance between taxa i and j. The
matrix element $Y^n_{i,j}$ is the distance between a reference node n
and the path i-j. The diagonal elements $Y^n_{i,i} = d_{i,n}$ correspond to
the pairwise distance between the reference node n and the taxon
i.

The contradiction on the order of the taxa can be defined as

$$C = \sum_{k>j>i} \left(\max\left(Y^n_{i,k} - Y^n_{i,j},\, 0\right)\right)^2 + \sum_{k>j>i} \left(\max\left(Y^n_{k,i} - Y^n_{k,j},\, 0\right)\right)^2 \tag{A.2}$$

for any i, j, k ≠ n. The best order of a distance matrix is, by
definition, the order minimizing the contradiction (Thuillard 2007,
2008). Thuillard & Fraix-Burnet (2009) showed that the perfect
order is linked to the convexity of the variables in the
parameter space, and is obtained for specific properties of the
variables along the order. It is then possible to detect the
discriminant potentiality of the variables. This is exactly what is done in
Sect. ??.
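
The contradiction of Eq. (A.2) can be evaluated directly for a given order of the taxa. The sketch below is a naive implementation written for illustration only; it is not the MCA code used in this work and performs no optimisation of the order. The function name `contradiction` and the toy distance matrix (points on a line, which is a degenerate tree metric) are assumptions.

```python
import numpy as np

def contradiction(dist, order, ref):
    """Contradiction C of Eq. (A.2) for taxa taken in `order`, reference `ref`."""
    d = np.asarray(dist, dtype=float)
    idx = np.array([t for t in order if t != ref])
    d_ref = d[idx, ref]
    # Y[i, j] = (d_in + d_jn - d_ij) / 2 for the ordered taxa
    y = 0.5 * (d_ref[:, None] + d_ref[None, :] - d[np.ix_(idx, idx)])
    m = len(idx)
    c = 0.0
    for i in range(m):
        for j in range(i + 1, m):
            for k in range(j + 1, m):
                c += max(y[i, k] - y[i, j], 0.0) ** 2   # violates Y_ij >= Y_ik
                c += max(y[k, i] - y[k, j], 0.0) ** 2   # violates Y_kj >= Y_ki
    return c

# Toy example: five taxa on a line; taxon 0 is the reference node.
x = np.array([0.0, 1.0, 3.0, 6.0, 10.0])
dist = np.abs(x[:, None] - x[None, :])
c_good = contradiction(dist, [1, 2, 3, 4], ref=0)   # perfect order
c_bad = contradiction(dist, [1, 3, 2, 4], ref=0)    # two taxa swapped
print(c_good, c_bad)
```

For this toy metric the natural order gives a zero contradiction, while swapping two taxa gives a positive value, which is the quantity that the minimum contradiction analysis seeks to minimise.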



A.3. Cladistic analysis 

Cladistics seeks to establish evolutionary relationships between
objects. It is a non-parametric character-based phylogenetic

Table A.1. Loadings on the eight principal components of the PCA made on the set of 23 parameters (see Sect. 3.1).

             Comp1     Comp2     Comp3     Comp4     Comp5     Comp6     Comp7     Comp8
Mgb        -0.9143 -0.122395    0.1594   -0.1593   -0.1215   0.04717   0.11541   0.03549
logs       -0.9067 -0.012224    0.0674    0.0950   -0.0443   0.02837  -0.05006   0.04723
Mg2        -0.8983 -0.158860    0.1190   -0.1121   -0.1455   0.05519   0.07249  -0.00744
mgbfe      -0.8879 -0.303358    0.0656   -0.1870    0.0543  -0.04251   0.00546  -0.04366
NaD        -0.8659 -0.099002    0.0527    0.0657   -0.0496  -0.00899  -0.00360  -0.03957
Mg1        -0.8470 -0.132117    0.1739   -0.0977   -0.1787   0.01250   0.06330  -0.05674
Kabs        0.7780 -0.357711    0.1878   -0.3577   -0.1884   0.00503   0.04219  -0.10201
mabs        0.7255 -0.359334    0.2095   -0.4395   -0.2075   0.06904  -0.06490  -0.11826
ldiam      -0.7248  0.409603   -0.2557    0.3327    0.1887  -0.09360   0.02929   0.06032
mgb.fe     -0.6742  0.245147    0.2938   -0.0296   -0.3977   0.17776   0.26559   0.15522
logre      -0.6512  0.570850   -0.3297   -0.2541   -0.0686  -0.01075  -0.01336  -0.01325
Fe5335     -0.5455 -0.406391   -0.1312   -0.2574    0.3127  -0.11438  -0.13735  -0.09773
hbeta       0.5389 -0.223947   -0.4974    0.2296    0.0104   0.18101  -0.01882  -0.00539
Fe5406     -0.4906 -0.465277   -0.1337   -0.0838    0.1044  -0.13418   0.01741  -0.11408
Fe5270     -0.4900 -0.522075   -0.1240   -0.1365    0.2660  -0.18686  -0.15191  -0.15975
H.K        -0.3065  0.117644    0.0945    0.2266    0.4477   0.48040  -0.20626  -0.28296
Fe5015     -0.2763 -0.550090   -0.5230    0.1094   -0.2198   0.17362   0.05076  -0.00969
D.B         0.2024  0.134962   -0.3041   -0.1834    0.2342  -0.15225   0.66215  -0.38279
B.R        -0.1338  0.363115   -0.2925   -0.4631   -0.1578   0.13444  -0.51784  -0.06306
Brie       -0.0673  0.614240   -0.4327   -0.4875   -0.0751  -0.08047   0.03262  -0.07633
Fe5709      0.0602 -0.172935   -0.1399   -0.3042    0.4027  -0.14195   0.06450   0.74089
OIII       -0.0311 -0.395042   -0.6303    0.1494   -0.3828   0.25483   0.08224   0.13941
J.H        -0.0175 -0.000856   -0.0977    0.3589   -0.3548  -0.69952  -0.23475  -0.11213

Table A.2. Fitness of parameters on the cladograms obtained for
each subset as represented by the Rescaled Consistency Index
(RCI). For each subset, the parameters are listed in order of
decreasing RCI, and the RCI values are given in the same order.

Subset  Order from RCI
        RCI

4cA     D/B    log σ  [MgbFe]'  NaD
        0.102  0.086  0.086     0.075

5c      D/B    log σ  NaD    [MgbFe]'  Bri_e
        0.077  0.063  0.063  0.059     0.051

5cA     [MgbFe]'  Mg b   log σ  D/B    NaD
        0.098     0.090  0.080  0.075  0.066

6c      log σ  D/B    NaD    [MgbFe]'  Bri_e  OIII
        0.059  0.055  0.055  0.050     0.040  0.039

6cA     [MgbFe]'  Mg b   log σ  D/B    NaD    Bri_e
        0.076     0.073  0.061  0.060  0.054  0.043

7c      D/B    log σ  OIII   NaD    [MgbFe]'  Fe5015  Bri_e
        0.053  0.051  0.044  0.041  0.040     0.037   0.031

8c      Mg b   [MgbFe]'  log σ  NaD    D/B    OIII   Fe5015  Bri_e
        0.055  0.052     0.050  0.044  0.041  0.038  0.033   0.030

10c     [MgbFe]'  Mg b   NaD    log σ  D/B    Fe5270  Bri_e  B-R    OIII   Fe5709
        0.075     0.055  0.044  0.042  0.034  0.033   0.025  0.025  0.023  0.020

method, also called a maximum parsimony method. It does not
use distances, because there is no assumption about the
metrics of the parameter space. The "characters" are instead traits,
descriptors, observables, or properties, which can be assigned
at least two states characterizing the evolutionary stage of the
objects for that character. The use of this approach in
astrophysics is known as astrocladistics (for details and applications,
see Fraix-Burnet et al. 2006b,c, 2009, 2010). Simply speaking,
the characters here are the parameters, the (continuous) values
of which supposedly evolve with the level of diversification of
the objects. The maximum parsimony algorithm looks for the
simplest arrangement of objects on a bifurcating tree. The
complexity of the arrangement is measured by the total number of
"steps" (i.e. changes in all parameter values) along the tree.

The success of a cladistic analysis depends strongly on the
behaviour of the parameters. In particular, it is sensitive to
redundancies, incompatibilities, too much variability (reversals), and
parallel and convergent evolutions. It is thus a very good tool
for investigating whether a given set of parameters can lead to a
robust and pertinent diversification scenario.
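
The number of "steps" that maximum parsimony minimises can be illustrated for a single character on a fixed tree. The sketch below assumes that the discretized states are treated as ordered characters (Wagner parsimony), so that a change from state 3 to state 7 costs 4 steps; this treatment, the nested-tuple tree representation, and the function name `wagner_steps` are assumptions made for the example, and the tree search itself (done here with PAUP) is not reproduced.

```python
def wagner_steps(tree, states):
    """Minimum number of steps for one ordered character on a binary tree.

    tree   : a leaf name, or a 2-tuple of subtrees
    states : dict mapping leaf name -> integer state
    Returns (interval of optimal states at this node, steps below this node).
    """
    if not isinstance(tree, tuple):
        s = states[tree]
        return (s, s), 0
    (lo1, hi1), n1 = wagner_steps(tree[0], states)
    (lo2, hi2), n2 = wagner_steps(tree[1], states)
    lo, hi = max(lo1, lo2), min(hi1, hi2)
    if lo <= hi:
        # Child intervals overlap: no extra step is needed at this node.
        return (lo, hi), n1 + n2
    # Disjoint intervals: the gap between them is the extra cost.
    return (hi, lo), n1 + n2 + (lo - hi)

states = {"a": 0, "b": 2, "c": 5, "d": 9}
tree_1 = (("a", "b"), ("c", "d"))
tree_2 = (("a", "c"), ("b", "d"))
steps_1 = wagner_steps(tree_1, states)[1]
steps_2 = wagner_steps(tree_2, states)[1]
print(steps_1, steps_2)
```

For several characters the step counts are simply summed, and the most parsimonious tree is the one with the smallest total. In this toy case the first tree, which groups objects with similar states, requires fewer steps than the second.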

In the present study, we used the same kind of analysis as
in our previous papers on astrocladistics. We discretized the
parameters into 30 equal-width bins, which play the role of
discrete evolutionary states. This choice of 30 bins is justified by
a fair representation of diversity, a stability of the analysis in
the sense that the result does not depend on the number of bins,
and a bin width roughly corresponding to the typical order of
magnitude of the uncertainties (i.e. 7%, see Fraix-Burnet et al.
2009). We also adopted the parsimony criterion, which consists
in finding the simplest evolutionary scenario that can be
represented on a tree. Our maximum parsimony searches were
performed using the heuristic algorithm implemented in the
PAUP*4.0b10 (Swofford 2003) package, with the Multi-Batch
Paup Ratchet method (footnote 3). The results were interpreted
with the help of the Mesquite software (Maddison & Maddison 2004)
and the R package (used for graphics and statistical analyses).
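The discretization step can be illustrated with a minimal sketch. It assumes that the 30 equal-width bins span the observed minimum-to-maximum range of each parameter, which the text does not state explicitly; the function name `discretize_equal_width` is hypothetical and is not taken from the software used in the paper.

```python
import numpy as np

def discretize_equal_width(values, n_bins=30):
    """Map continuous values to integer states 0..n_bins-1 (equal-width bins).

    Assumption: the bins span [min, max] of the supplied values.
    Missing values (NaN) are not handled in this sketch.
    """
    values = np.asarray(values, dtype=float)
    lo, hi = values.min(), values.max()
    if hi == lo:
        # A constant parameter carries no information: one single state.
        return np.zeros(values.shape, dtype=int)
    width = (hi - lo) / n_bins
    states = np.floor((values - lo) / width).astype(int)
    # The maximum value falls on the upper edge; put it in the last bin.
    return np.clip(states, 0, n_bins - 1)
```

Each parameter would be discretized separately, so that the resulting integer matrix (objects by parameters) can serve as the character matrix of the parsimony search.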

Making cladistic analyses with different sets of parameters 
both helps to find the most robust result and gives interesting 
information on the behaviour of the parameters themselves. The 
robustness of cladograms is always difficult to assess objectively, 
so we use a criterion similar to that of other statistical distance 
analyses: if a similar result is found by using different conditions 
or methods, then it can be considered as reasonably robust. We 
applied four possible tests here: 



3 http://mathbio.sas.upenn.edu/mbpr

1. The occurrence of a branching pattern among most parsimonious
trees: with so few parameters, many equally parsimonious
trees are found, and their number is often arbitrarily limited to
1000. The majority-rule consensus of all of them yields a
percentage of occurrence for each node. The higher this percentage,
the higher the probability that this node is "robust".

2. The agreement of branching patterns between sub-sample 
analyses, which can be called "internal consistency": by 
making analyses of several sets of arbitrarily selected sub- 
samples, we can check whether a given pattern is present on 
trees found with larger samples, including the full tree. 

3. The comparison between different sets of parameters: any 
result should preferably not depend too much on a single 
parameter. Adding or removing a parameter should not dras- 
tically change the tree. 

4. A comparison with the results of a cluster analysis:
distance-based methods are totally independent of cladistics, so any
agreement gives us fair confidence in the result.
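The percentage of occurrence used in the first test can be sketched as follows. This is an illustration only, not the PAUP implementation: trees are assumed to be rooted and written as nested tuples of object names, and the helper names `clades`, `clade_support`, and `majority_rule` are hypothetical.

```python
from collections import Counter

def clades(tree):
    """Return (leaves, clades) for a rooted tree given as nested tuples.

    A clade is the set of leaf names found below one internal node.
    """
    if not isinstance(tree, tuple):
        return frozenset([tree]), set()
    leaves = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found |= child_clades
    found.add(leaves)
    return leaves, found

def clade_support(trees):
    """Percentage of the input trees in which each clade occurs."""
    counts = Counter()
    for tree in trees:
        _, found = clades(tree)
        counts.update(found)
    return {clade: 100.0 * n / len(trees) for clade, n in counts.items()}

def majority_rule(trees, threshold=50.0):
    """Keep only the clades present in more than `threshold` percent of trees."""
    return {c: s for c, s in clade_support(trees).items() if s > threshold}
```

For example, with three trees of which two contain the pair (a, b), that node gets a support of about 67% and is kept in the majority-rule consensus, while a grouping seen in a single tree is dropped.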

Since we have many more objects than parameters, a lot of 
"flying" objects are expected between different analyses, and the 
above tests should be done with statistics in mind. The first test 
is always positive in this study: percentages are higher than 70- 
75%, and most often they are above 95%. This is already an 
indication that some structure is present in the data. The other 
three tests are described below. 

The full sample of 424 galaxies was divided into three sub- 
samples with 105 objects each and a fourth one with 109 ob- 
jects. The first and fourth subsamples were found to belong ex- 
clusively to clusters 1 and 3, respectively, of the cluster analysis. 
The diversity in the first subsample is less than for the others, 
so that the resulting tree is generally less well-resolved. The
first two subsamples were also combined to form a 210-object
subsample, and the last two to form a 214-object subsample.
Analyses were performed with these six subsamples, as
well as the full sample. We then estimated the internal
consistency by comparing the seven trees two by two and by eye (with
the help of the cophyloplot function in R, which
connects a given object between the two trees).

This procedure was applied to each of the eight-parameter 
subsets given in Table Q]. Subsets 5c, 5cA, 6cA, and 3c show
rather good internal consistency, 4c, 7c, and 6c fairly good
consistency, and 8c and 10c rather poor consistency.

This already shows that the optimal number of parameters is
around 5, 6, or at most 7, in excellent agreement with the
PCA (Sect. DOT).

If we compare the trees obtained with the full sample for the 
eight-parameter subsets, we find that subset 5c is very consistent 
with 6c, 7c, and 8c. In addition, 5c, 6c, 5cA, and 6cA are in good 
mutual agreement, while this is not the case for 6c, 7c, and 8c. 

In Table A.2 we show for each tree the Rescaled
Consistency Index (RCI), which measures how well a parameter
fits the phylogeny depicted by the tree. The higher the
RCI (i.e. the closer it is to 1), the more discriminant the
parameter. In other words, parameters with higher RCI are the most
responsible for the structure of the tree. The absolute value
depends on the number of objects and parameters, so it cannot be
used to compare trees obtained with different data. Here, we can
only use it to compare parameters for a given tree. In Table A.2
the parameters are ordered according to RCI.
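For reference, the rescaled consistency index of a single character is conventionally defined as the product of the consistency index CI = m/s and the retention index RI = (g - s)/(g - m), where m is the minimum possible number of steps for that character on any tree, s the number of steps observed on the tree considered, and g the maximum possible number of steps. The sketch below assumes this standard definition; the function name is hypothetical, and returning 0 when g = m (an uninformative character) is a convention chosen here.

```python
def rescaled_consistency_index(min_steps, observed_steps, max_steps):
    """RCI = CI * RI for one character on one tree (standard definition).

    min_steps: minimum number of steps the character can show on any tree (m)
    observed_steps: number of steps on the tree under study (s)
    max_steps: maximum number of steps the character can show on any tree (g)
    """
    if not (0 < min_steps <= observed_steps <= max_steps):
        raise ValueError("expected 0 < min_steps <= observed_steps <= max_steps")
    ci = min_steps / observed_steps
    if max_steps == min_steps:
        # Uninformative character: RI is undefined, 0 is used by convention here.
        return 0.0
    ri = (max_steps - observed_steps) / (max_steps - min_steps)
    return ci * ri
```

A character that changes as few times as possible on the tree (s = m) gets RCI = 1, and the value decreases towards 0 as more reversals or convergences are needed.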

When Mg b and [MgbFe]' are present together in a subset,
they dominate the shape of the tree (sets 5cA, 6cA, 8c, and 10c),
log σ and D/B being right after them. Mg b and [MgbFe]' are
obviously redundant because they are very well-correlated and
are more or less the same measure. Hence, they cannot be used
simultaneously in the cladistic analysis, and the trees that we find
in this case are more linear than the others. In contrast, log σ
and D/B are not at all correlated, but are always together, and
dominate the tree shape when Mg b is not present together with
[MgbFe]'. In addition, NaD is very discriminant, and only roughly
correlated with log σ and [MgbFe]'.

Fig. A.1. Plots showing the jumps as defined in Sect. A.4. Top:
jumps for the PCA+CA analysis (Sect. 4.1). Bottom: jumps for
the cluster analysis with the six parameters (Sect. 4.2).

If we compare with the clusters obtained with the cluster
analysis, the agreement decreases roughly in the order 6c, 7c, 5cA,
3c, 5c, 4cA, 8c, and 10c, the best being undoubtedly 6c. The
corresponding tree with the groups is shown in Fig. 3.

A.4. Cluster analysis 

In the present study, we adopted the K-means partitioning
algorithm of clustering following MacQueen (1967). This method
constructs K clusters using a distance measure (here Euclidean).
The data are classified into K groups around K centres, such
that the distance of a member object of any particular cluster
(group) from its centre is minimal compared to its distance from
the centres of the remaining groups. The requirement for the
algorithm is that each group must contain at least one object
and each object must belong to exactly one group, so there
are at most as many groups as there are objects. Partitioning
methods are applied (Whitmore 1984; Murtagh 1987;
Chattopadhyay & Chattopadhyay 2006, 2007; Babu et al.
2009; Chattopadhyay et al. 2009a; Chattopadhyay et al. 2010)
if one wishes to classify the objects into K clusters where
K is fixed. Cluster centres were chosen based on a group
average method, which ensures that the process is almost robust
(Milligan 1980).
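The partitioning described above can be sketched with a few lines of NumPy. This is a minimal illustration of the standard iterative K-means scheme with a Euclidean distance, not the code used for the paper. In particular, the default starting centres (evenly spaced rows of the data) are an assumption made for the example and do not reproduce the group average initialisation mentioned in the text; the function name `kmeans` is hypothetical.

```python
import numpy as np

def kmeans(data, k, init_centres=None, max_iter=100):
    """Plain K-means with Euclidean distance. Returns (labels, centres)."""
    data = np.asarray(data, dtype=float)
    if init_centres is None:
        # Assumption for this sketch: evenly spaced rows as starting centres.
        idx = np.linspace(0, len(data) - 1, k).astype(int)
        centres = data[idx].copy()
    else:
        centres = np.asarray(init_centres, dtype=float).copy()
    labels = None
    for _ in range(max_iter):
        # Squared distance of every object to every centre, shape (n, k).
        d2 = ((data[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        new_labels = d2.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break  # assignments are stable: converged
        labels = new_labels
        for j in range(k):
            members = data[labels == j]
            if len(members) > 0:
                centres[j] = members.mean(axis=0)
            # An empty group keeps its previous centre in this sketch.
    return labels, centres
```

Each object ends up in exactly one group, the one whose centre is closest. The sketch does not enforce the requirement of at least one object per group, so a poor initialisation could leave a group empty.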

To achieve an optimum choice of K, the algorithm is run
for K = 2, 3, 4, etc. For each value of K, the value of a
distance measure $d_K$ (called the distortion) is computed as
$d_K = (1/p) \min_x E[(x_K - c_K)'(x_K - c_K)]$, which is defined as
the distance of the $x_K$ vector (values of the parameters) from the
centre $c_K$, where $p$ is the order of the $x_K$ vector. If $d'_K$ is
the estimate of $d_K$ at the $K$th point, then the optimum number of
clusters is determined by the sharp jump in the curve
$J_K = (d'^{-p/2}_K - d'^{-p/2}_{K-1})$ vs K (Sugar & James 2003).
The jumps as a function of K for our PCA+CA and CA analyses are
shown in Fig. A.1.
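The jump criterion can be written compactly. The sketch below assumes that the distortion is estimated as the mean squared distance of the objects to their cluster centre divided by p, that the transformed distortion for K = 0 is set to zero as in Sugar & James (2003), and that the "sharp jump" is identified with the largest jump. The function names are hypothetical.

```python
import numpy as np

def distortion(data, labels, centres):
    """Estimated distortion: mean squared distance to the own centre, per dimension."""
    data = np.asarray(data, dtype=float)
    centres = np.asarray(centres, dtype=float)
    labels = np.asarray(labels, dtype=int)
    p = data.shape[1]
    residuals = data - centres[labels]
    return (residuals ** 2).sum(axis=1).mean() / p

def jump_statistics(distortions, p):
    """J_K = d_K^(-p/2) - d_(K-1)^(-p/2), with distortions[i] given for K = i + 1."""
    d = np.asarray(distortions, dtype=float)
    transformed = d ** (-p / 2.0)
    # Convention assumed here: the transformed distortion for K = 0 is 0.
    previous = np.concatenate(([0.0], transformed[:-1]))
    return transformed - previous

def optimal_k(distortions, p):
    """Number of clusters at which the jump is largest."""
    return int(np.argmax(jump_statistics(distortions, p))) + 1
```

For instance, with p = 2 and distortions 4, 1, 0.25, 0.2 for K = 1 to 4, the jumps are 0.25, 0.75, 3, and 1, so K = 3 would be selected.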

Appendix B: Analysis with log σ, log r_e, Bri_e and Mg2, and error bars

B.1. Analysis with log σ, log r_e, Bri_e and Mg2

We complemented the study presented in this paper with
the analysis of our sample with the four parameters (log r_e,
log σ, Bri_e, and Mg2) as in Fraix-Burnet et al. (2010). We
used the same three multivariate techniques (cluster analysis,
Minimum Contradiction Analysis, and cladistics) as presented
in Sect. 2.2 and Appendix A.

The resulting tree is less structured (more galaxies lie on
individual branches) than the one obtained in the present paper
using six parameters. This can be explained by log r_e and Mg2 not
having been found to be discriminant parameters for the
considered sample. It is also less structured than in Fraix-Burnet
et al. (2010), which uses the same four parameters; this is probably
due to the problems in determining log r_e.

To summarize the results, we show the projection of the
three trees (the one obtained in this paper with six parameters,
the one obtained here with four parameters, and the one of
Fraix-Burnet et al. 2010) onto the fundamental plane (log σ vs
Bri_e) without the data points (Fig. B.1). Globally, there is good
agreement and the groupings are consistent. However, the
projected tree from the present Appendix departs from the other two
in the top half of the figure. This is because this tree is less
structured than the others, so that instead of having one or two groups
at this level, there is a sequence of single branches that makes the
trunk of the tree "follow" individual objects more closely.

B.2. Influence of r_e and error bars on the partitioning

The effective radius log r_e in our sample is recomputed through a
statistical relation between the linear diameter of the galaxy (D_n)
and its velocity dispersion (σ), which was determined in another
paper (Bernardi et al. 2002). The reason given by Ogando et al.
(2008) is that, due to the very low redshift of the galaxies in
the sample, "the conversion of r_e in arcseconds to kpc needs a
reliable determination of the galaxy distance (D). Considering
just the redshift to calculate D, we may incur in error due to
the peculiar motion of galaxies. Thus, we adopted D given by
the D_n vs σ relation (Bernardi et al., 2002) to calculate r_e in
kpc." However, this relation was obtained with some assumptions
(such as the identical properties of galaxies in several clusters)
and introduces a dependence of log r_e (through D) on log σ.

Fig. B.3. Errors in log σ, Bri_e, and log r_e (taken for the errors in
D/B).

The two radii (Fig. B.2) are quite well-correlated with each
other, but the dispersion is relatively large. We performed two
cladistic analyses with the four parameters of the fundamental
plane (log r_e, log σ, Bri_e, and Mg2) as above using the two
determinations of the effective radius. The agreement between the two
results is only fair. This can be explained by the relatively large
discrepancy between the two different values of r_e (median
difference of 10%). This is similar to the uncertainty in
log r_e, but much larger than for the other parameters. In addition,
the radius or dimension of galaxies does not appear as a
discriminant parameter in the study presented in this paper. Hence, it is
not so surprising that analyses using this parameter are not very
stable.

with six parameters. Thick lines represent the "trunk" of the trees, while the small branches relate the trunks to the mean of each
group. For clarity, results are compared two by two, and only the trunks are shown for the three studies on the lower right diagram.
These are evolutionary tracks in the sense of diversification, and not the path of evolution for a single galaxy.

We now consider the robustness of our clustering result for
the six-parameter analysis when taking measurement errors into
account. It is statistically a very challenging task to assess the
influence of the errors. However, cladistics can easily take the
error bars into account, since the optimisation criterion in all
analyses performed so far in astrocladistics is parsimony:
among all the possible arrangements of the objects on trees,
the simplest evolutionary scenario is retained. The parsimony is
measured by using the number of "steps", that is the total
number of changes in parameter values along all the branches of the
tree. If a missing value or an uncertain one (given by a range
of values) is included in the data matrix, all possible values are
considered and the one corresponding to the simplest tree is
favoured. This simply increases the number of possible cases to
consider. We note that all possible values within the range
allowed by measurement uncertainties are given the same weight,
whereas the probability distribution is generally expected to be
higher at the central value (ideally Gaussian).

Fig. B.4. The most parsimonious tree found with cladistics taking
uncertainties in the parameters into account. The colours
correspond to the groups defined in Fig. 3.
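The way a range of values enters the step count can be illustrated on a fixed tree. The sketch below assumes ordered characters (the cost of a change is the difference between bin numbers) and uses the classical interval method for such characters: each object carries a range (low, high) of admissible states, overlapping ranges cost nothing, and disjoint ranges cost the size of the gap between them. Whether the analysis in the paper treats characters exactly this way is an assumption. The tree is a rooted binary tree written as nested pairs, and all names are hypothetical.

```python
def wagner_steps(tree, leaf_ranges):
    """Minimum number of steps for one ordered character on a fixed tree.

    tree: nested 2-tuples of leaf names, e.g. (("a", "b"), "c")
    leaf_ranges: leaf name -> (low, high) range of admissible states;
                 a value without uncertainty is given as (v, v)
    Returns (state range at the node, steps below the node).
    """
    if not isinstance(tree, tuple):
        low, high = leaf_ranges[tree]
        return (low, high), 0
    left, right = tree
    (l_lo, l_hi), l_steps = wagner_steps(left, leaf_ranges)
    (r_lo, r_hi), r_steps = wagner_steps(right, leaf_ranges)
    lo, hi = max(l_lo, r_lo), min(l_hi, r_hi)
    if lo <= hi:
        # The two ranges overlap: no extra step is needed at this node.
        return (lo, hi), l_steps + r_steps
    # Disjoint ranges: the node takes the gap [hi, lo], at a cost of lo - hi.
    return (hi, lo), l_steps + r_steps + (lo - hi)

def tree_length(tree, matrix):
    """Total steps over all characters; matrix maps leaf -> list of ranges."""
    n_chars = len(next(iter(matrix.values())))
    total = 0
    for i in range(n_chars):
        ranges = {name: row[i] for name, row in matrix.items()}
        total += wagner_steps(tree, ranges)[1]
    return total
```

With exact states 3, 5, and 4 for objects a, b, and c on the tree ((a, b), c), two steps are needed. If a is only known to lie in the range 3 to 5, the count drops to one step, because the value of a that gives the simplest tree is the one retained, with every value in the range given the same weight.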

We performed a cladistic analysis similar to that in Fig. 3
using the error bars given in Ogando et al. (2008) and Alonso et al.
(2003) for log σ and Bri_e, and for D/B we considered the
error given for log r_e in Alonso et al. (2003). These errors are
shown in Fig. B.3. For NaD, [MgbFe]', and OIII, we assumed
a face value of 10%, which is the upper limit estimated by
Ogando et al. (2008) for all the Lick index values.

The resulting tree shown in Fig. B.4 is slightly less
structured than the one in Fig. 3, but most groups are broadly
preserved. Clad3 appears to be mixed with Clad1 and Clad5 to be
mixed with Clad6. In addition, Clad7 and Clad8 are somewhat
mixed with each other. Interestingly, these behaviours are
similar to those inferred from the comparison with the partitioning
derived from the cluster analysis. In addition, the agreement is
quite satisfactory given the large uncertainties for half of the
parameters (the Lick indices), a face value given to these
uncertainties, and the equal probability given to all values within the
range of uncertainty.

These results show that the cladistic analysis is relatively
robust to measurement errors, as found through the comparison 
with different clustering methods. 

Appendix C: Supplementary figures

Fig. C.1. Same boxplots as in Fig. 4 but for the cluster
partitioning. Colours are the ones given in Fig. 2.